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ABSTRACT 



This thesis examines two smoothing algorithms which deviate from 
the classical method of using only one neighborhood size in the 
smoothing procedure. The Supersmooth algorithm uses three neighbor- 
hood sizes with local cross-validation in order to estimate an optimal 
neighborhood size. The Split Linear Fit algorithm uses any number of 
neighborhood sizes and computes a family of linear fits corresponding to 
each neighborhood size; the final smooth points are a weighted average 
of the linear fits. These two advanced smoothers are evaluated against 
the results produced by previously validated, commonly used smoothers 
and regression techniques. The measure of performance is the quality 
of the smooth curves and the value of the sum of squared residuals. 



THESIS DISCLAIMER 



The reader is cautioned that computer programs developed in this 
research may not have been exercised for all cases of interest. While 
every effort has been made, within the time available, to ensure that 
the programs are free of computational and logic errors, they cannot be 
considered validated. Any application of these programs without addi- 
tional verification is at the risk of the user. 
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I. INTRODUCTION 



A. BACKGROUND 

"It is a well-established rule of scientific investigation that the first 
time an experiment is performed the results bear all too little resemb- 
lance to the 'truth' being sought" [Ref. 1: p. 1], The experiment may 
be the simple task of data collection, i.e. survey, or a process of 
generating data. The analyst may have a small set or a large, set of 
data or a series of observations which must be analyzed. After some 
data analysis, the analyst may extract quantities relevant to purposes 
that he/she has in mind for further analysis. This analysis and data 
extraction process has the formal name of data reduction. Tukey calls 
this process "exploratory data analysis" [Ref. 2: p. 1]. 

There are several statistical methods that can facilitate the data 
reduction process. The quote, "a picture says a thousand words," 
suggests that the data analysis involves pictorial representations of the 
observed data. The single, most powerful statistical tool is a "well- 
chosen graph" [Ref. 3: p. 1]. A well-chosen graph enables salient 

features of a data set to be picked out and vividly portrayed so that 
the analyst can spot the features of particular interest [Ref. 4: p. 41]. 

The data set is very often bivariate data, i.e. pairs of values 
(X^, Y^), . . ., (X^, Y^), where it is conventional that the Yj, 

called the ordinate, be a function of the corresponding Xj, called the 
abscissa. The abscissa indicates a specific snapshot of time or is the 
input to an experiment, i.e. the value of an independent variable. The 
analysis of the data basically concentrates on finding a relationship 
between the Xj and the Yj. The single most powerful statistical tool 
for analyzing the relationship between the Xj and the Yj is the scatter- 
plot [Ref. 3: p. 75]. A scatterplot is a two-dimensional graph which 
visually displays the relationship of the pairs of Xj and Yj. The 
vertical axis of the scatterplot represents the scale values of the 
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ordinate or Yj, and the horizontal axis represents the scale values of 
the abscissa or Xj, A scatterplot is easily accepted by the human 

brain which quickly summarizes the depicted information and extracts 
the salient features, patterns, and relationships that are not detected 
with other data analytical methods, e.g. tabulated data. Figure 1.1 is 
an example of a scatterplot displaying the Daily Sea-Surface 

Temperature for 1971 at Granite Canyon, just South of Point Sur, 
California [Ref. 4]. This sea-surface temperature data for 1971 is 
given in tabular form in Appendix D. 



DAILY SEA SURFACE TEMPERATURE 
AT GRANITE CANYON, CA. 



z 2 

UJ 

<£ 



e 




MARCH 1, 1971 TO DECEMBER 31. 1971 




1060 1110 1160 1210 1260 1310 1360 

JULIAN CALENDAR DATE 



Figure 1.1 Scatterplot of Sea Surface Temperatures for 1971. 



The scatterplot shown in Figure 1.1 is more compact and informative 
than the corresponding tabulated data in Appendix D. The scatterplot 
indicates that the sea-surface temperature varies with the time of year, 
i.e. general temperatures increase during the summer and decrease 
during the fall. There may have been other extraneous factors that 
affected the temperatures, e.g. the warm ocean current El Nino, an 
intra-yearly occurrence which sometimes causes great climatic turbulence 
all over the world, could be the cause of the great temperature varia- 
tions shown by the scatterplot in Figure 1.1. It is very difficult to 
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DAILY SEA SURFACE TEMPERATURE AT GRANITE CANYON. CA. 
MARCH 1, 1971 TO OECEMBER 31. 1971 




Figure 1.2 Smoothed Sea Surface Temperatures for 1971. 

interpret the function imbeded in the scatterplot of Figure 1.1. An 
attempt to sketch a rough line that follows the curvature of the points 
may result in a tenuous and perhaps incorrect line in terms of depicting 
the variability such as the periodicity in the data set. The sketching 
of the line through the scatterplot of Figure 1.1 would take time and 
involve strong subjective decisions. The result could be a misinterpre- 
tation of the scatterplot/data. A more effective and substantiated 
method of data reduction is "smoothing", see Figure 1.2. This scatter- 
plot with a smooth curve through the raw data is more acceptable to the 
human eye than a plain, data scatterplot and fairly well approximates 
the raw data. A cyclic change of the sea surface temperature is 
emphasized by the smooth curve. In addition, Figure 1.2 depicts that 
a cycle with roughly a monthly period could exist in the data, i.e. 
there are twelve peaks shown by the smooth curve . 

Smoothing can be used on data sets whose scatterplots indicate an 
underlying relationship that is either a simple linear function or a 
complex sinusoid function. Smoothing has in the recent past years 
become a useful data reduction technique. Banks, insurance companies, 
and industrial firms smooth economic surveys [Ref. 5: p. 1]. 
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The government does smoothing on income data such as tax payments, 
salaries and benefits to civil servants, and social services costs 
[Ref. 6], The space program does smoothing of test flight paths, fuel 
usage, and orbiting ejection path data [Ref. 7] . Conferences have had 
smoothing as the sole subject of discussion [Ref. 8]. 

The smoothing algorithm that is used to smooth a data set must use 
a procedure that is flexible enough to discover trends in the data, i.e. 
be able to accurately trace the observed data and respond to local 
changes. Therefore, the algorithm should use local smoothing rather 
than global smoothing which is used in linear regression and curve 
fitting. This procedure allows the observed data to determine the 
shape of the smooth curve. 

An advanced smoothing algorithm must be more computationally effi- 
cient and more user friendly than most current smoothing algorithms. 
In addition an advanced smoothing algorithm must be able to correctly 
extract the underlying function from the observed data. 

B. SCOPE 

This paper discusses and analyzes two advanced smoothing algo- 
rithms, the Supersmoother algorithm [Ref. 9] and the Split Linear Fit 
Algorithm [Ref. 10] . These two smoothing algorithm were developed at 
Stanford University and thus have many similarities. The basic concept 
used in these algorithms is that the underlying function is thought of 
as a low frequency signal; therefore, the observed raw data is the 
signal plus noise. Thus, the smoother is analogous to a low-pass filter 
which is designed to compromise between the signal extracted, i.e. 
desirable effects, and the noise filtered out, i.e. undesirable effects 
[Ref. 10: p. 1], Equation 1.1 shown below is a generalization of the 
low-pass filter: 

Y,= /(Y,) + n , (1-1) 
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where Yj is the observed value, f(Yj) is the smooth function or 
extracted signal, and rj is the additive residual or noise filtered out. 
It is initially assumed that the set of Yj is an independent and identi- 
cally distributed (i.i.d.) random sample from some unknown joint distri- 
bution F(X, Y) . It is also sometimes assumed that the rj are i.i.d. 

9 

with zero expectation and constant variance <j , but possibly correlated; 

thus the notation follows the convention set by time series theory 
[Ref. 3: p 246]. The computed smooth values are estimates of the 
smooth function f(Yj). It is best that the smooth point values be 
computed using local averaging [Ref. 10: p. 3], in other words the I** 1 
smooth point value is the average of the Y values corresponding to the 
X values within a neighborhood of size K about Xj, where a neighbor- 
hood of size K about Xj will have (K/2) point values to the right and 
left. Equation 1.2 shown below states this averaging procedure in 
conditional form, indicating that only the Y values that correspond to 
the X values within the neighborhood K about Xj are involved in the 
averaging. 

s(Xj) = average(Yj given Xj a member of the neighborhood K^, (1.2) 

where s(Xj) is the computed smooth point value corresponding to Xj, 
Kj is the neighborhood size corresponding to Xj, J=l, . . . , Kj is the 
J** 1 member of the neighborhood of size Kj, and 1=1, . . . , N is the 
index of the N points to be smoothed. For the simple, equal-weight, 
moving average smoother, the smooth value at point Xj is computed by 
equation 1.3: 

/+ f 

s ( x >) = y x £ Yj • (1.3) 

2 

where K is the neighborhood size and may encompass a fraction of the 
data set to be smoothed or the entire data set. By looking at equation 
1.3, it can be deduced that when 1=1, . . ., (K-l) and I=(N-K-1), . . 
., N the subscript of Y is negative and has no corresponding Y values. 
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Most simple moving average smoothers do not involve the latter 
mentioned index values and begin the averaging with I=(K/2) and end 
the averaging with I=N-(K/2); thus, the smooth output will have less 
values than N, exactly K less values. 

The neighborhood size, denoted above by K, referred to later in 
this thesis as bandwidth, span, or windowsize, is a critical value which 
must be chosen carefully because it determines to a great degree, the 
goodness of fit of the smooth curve to the raw data. For example, with 
the equal- weight, moving average smoother, a large neighborhood size 
results in the loss of many smooth point values, and thus, the raw data 
is not well depicted. A commonly used measure of goodness of fit is 
the sum of squared residuals; thus, it is necessary to examine a 
squared residual value in general terms. If the output of the smooth 
function f(Xj) is accepted as an estimate of the corresponding Yj and a 
linear fit is done on the points within the neighborhood, then the 
expected squared residual at point Xj, given a neighborhood size K, 
may be determined by equation 1.4: 



r*(X, I K) 



/*— 



12 



/ (V,) - 



TT 

2 



/ (Yj) 



1 2 
K X<7 



(1.4) 



The term within the brackets is the bias component of the estimated 
residual value corresponding to X j ; in other words, the degree to 
which the smooth point value deviates from the actual point value. The 
second term is the variance component which indicates that the assumed 
inherent constant variance of the residuals must be equally shared by 
the estimated residuals within the neighborhood. Increasing the neigh- 
borhood size, K, increases the bias and decreases the variance, thus a 
plot of the smooth values will get smoother as K is increased. 
Decreasing K will have the opposite effect. 



Most smoothing algorithms use only one neighborhood size to 
produce the smooth values, i.e. the same K for all Xj in equations 1.3 
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and 1.4. The problem with this method is that the smoothing program 
may have to be run several times with each run containing a different 
K before the desired smoothing effect is produced. This thesis 
discusses two advanced smoothing algorithms which deviate from this 
procedure. The two advanced smoothing algorithms are: 

1. the Supersmoother algorithm developed by Friedman and Stutzle 
[Ref. 9]; 

2. the Split Linear Fit algorithm developed by McDonald and Owen 
[Ref. 10]. 

The Supersmoother requires that the user enter three different 
neighborhood sizes, SPAN-p SPAN 2 , and SPAN 3 , in increasing order. 
Each span value determines a neighborhood size about each Xj on which 
a linear regression is done. Therefore, three sets of regression results 
will correspond to each Xj. Each of the three slope values, the three 
corresponding y-intercept value, and the corresponding Xj are used to 
compute three fitted values. Each fitted value is subtracted from the 
input Yj value corresponding to Xj; the resulting values are called 
cross- validated residuals. The minimum, absolute value of these cross- 
validated residuals is then selected along with its span value. This 
span value is an estimate of the optimal span value corresponding to 
Xj . This estimate is then adjusted using an outlier rejection rule which 
will reflect the degree of robust smoothing desired by the user. The 
smallest span value, SPAN^, and the largest span value, SPAN 3 , dictate 
the range within which Supersmoother finds the optimal span value. 
The middle span value, SPAN 2 , is used as a central smoother, i.e. by 
smoothing the array of optimal span values with the middle span value 
the variability is reduced. This smoothing adjusts the span values so 
that the values flow smoothly from one point to the next adjacent point. 
This method of finding the optimal span values is called local cross- 
validation [Ref. 9: p. 1 ]. The method of cross-validation is a testing 
procedure that uses the estimated regression equation on data different 
than the data used to estimate the coefficients of the estimated regres- 
sion equation [Ref. 11: p. 110]. 
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The Split Linear Fit algorithm uses one or more neighborhoods, 
called window sizes, in order to produce for each I a family of linear 
fitted values. Weights which indicate the goodness of fit of the fitted 
values are then assigned to each of these linear fitted values. The 
final smooth value is computed as a weighted average of the linear 
fits within the I** 1 family of linear fits. 

This technique of using more than one neighborhood size allows the 
analyst to set upper and lower limits on the neighborhood size. By 
accepting more than one neighborhood size, these advanced smoothers 
take full advantage of the powerful computational capabilities of a 
computer and thus are quicker and more efficient than other smoothers, 
i.e. desired smoothing effects are achieved in less runs of a smoothing 
program. 

The purpose of this thesis is to expand the data smoothing subrou- 
tine developed by Friedman and Stuetzle [Ref. 9] and the smoothing 
program developed by McDonald and Owen [Ref. 10] into user friendly, 
interactive computer programs, i.e. the user exchanges information with 
the computer, that can be used as an exploratory data analytical tool 
by students and faculty of the Naval Postgraduate School. 

The Supersmoother algorithm was written as a FORTRAN subroutine 
and has been incorporated into an interactive FORTRAN program. The 
Split Linear Fit algorithm was part of a data smoothing package written 
in the C computer language, which is not a common computer language 
used at the Naval Postgraduate School. The Split Linear Fit algorithm 
has been translated and is incorporated into an interactive FORTRAN 
program. The point values produced by the Split Linear Fit FORTRAN 
version are equivalent to the point values produced by the C language 
version. Both the Supersmoother and the Split Linear Fit algorithms 
are written in FORTRAN 77 for use on the IBM 3033 computer being 
used at the Naval Postgraduate School. SUPSMO is the Supersmoother 
program and SPLITSMO is the Split Linear Fit program. These two 
FORTRAN programs are designed to produce output in any one of 
following three forms: 
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1. a CMS data file; 

2. an APL, "A Programming Language," variable; 

3. graphs produced with the IBM GRAFSTAT* statistical graphics 
package [Ref. 12]. 

These programs are written for use by any individual who has access to 
the IBM 3033. With simple commands the user can create or access an 
APL workspace and create an APL variable that stores the smooth 
output. Access to the GRAFSTAT graphics package is easy and done 
without exiting the smoothing program. Creation of a CMS file is even 
easier. GRAFSTAT is a graphics package which is an experimental 
program available at the Naval Postgraduate [Ref. 12]. 

Complete user instructions on how to use SUPSMO and SPLITSMO 
are available in Chapter VI and VII. Mathematical details on the the 
Supersmoother and the Split Linear Fit are presented in Chapters II 
and III, respectively. In Chapter IV are the evaluation results from 
smoothing three simple sets of data with these two advanced smoothers. 
These smoothing results are compared to the smoothing produced by 
previously verified smoothers, e.g. LOWESS and Moving Average. In 
Chapter V a real application of the Supersmoother and the Split Linear 
Fit programs is presented. The Granite Canyon Daily Sea-Surface 
Temperature data for the period of March 1971 to February 1983 is used 
in the analysis presented in Chapter V. This data set is used because 
of the large size of the series, 4380 points; because the variance may 
not be constant, and because the complex underlying function seems to 
contain some periodicity. 



^GRAFSTAT is an experimental APL package from IBM which the 
Naval Postgraduate School is using under an agreement with the IBM 
Research Center, Yorkstown Heights, N. Y. 
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II. TECHNICAL DESCRIPTION OF SUPERSMOOTHER ALGORITHM 



A. OVERVIEW 

The data smoothing algorithm. Supersmoother, was developed at 
Stanford University by Jerome H. Friedman and Werner Stuetzle 
[Ref. 9]. The smoothing technique uses local averaging [Ref. 9: p. 
3], local linear fitting [Ref. 9: p. 3], and selection of a local optimal 
span [Ref. 9: p. 8], i.e. application of method of cross-validation 

[Ref. 9: p. 1], The developers claim that Supersmoother is "both very 
flexible and rapidly computable" [Ref. 9: p. 3]. One of the features 
which makes Supersmoother flexible is that Supersmoother is scale inde- 
pendent. In other words, the X values must be equi-spaced but can 
belong to the interval (0.0, 1.0] or the interval [1.0, 2.0, 3.0, . . ., 
N] , where N is the number of point values to be smoothed, while the Y 
values must be real values and need not be equi-spaced. Another 

feature which makes Supersmoother flexible is that there is an option of 
entering one or three global span values where these values are entered 
as a ratio of the span to the number of points to be smoothed. 
Another flexibility feature is that there is an outlier rejection rule 
which allows the user to adjust the degree of robustness using an index 
within the interval [0.0, 10.0], where 0.0 indicates robust smoothing 

and 10.0 indicates non-robust smoothing. Supersmoother uses a small 
amount of computer time and of storage space by using computation and 
data storage procedures commonly used in dynamic programming, i.e. 
Fj_^(X) is used to update Fj(X) and only the new value is stored. 

The objective of Supersmoother is to efficiently smooth a scatterplot 
[Ref. 9: p. 1]. Supersmoother consists of two subroutines, the 

Combining Subroutine and the Smoothing Subroutine, see Figure 2.1. 
The Combining Subroutine and the Smoothing Subroutine exchange data 
arrays once if only one span value is used and eight times if three 
span values are used. 
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Figure 2.1 Supersmoother Subroutines. 

Figure 2.2 shows the flow of data within the Supersmoother algo- 
rithm. The Combining Subroutine receives the data to be smoothed and 
other pertinent parameters and sets up the data for transmission to the 
Smoothing Subroutine. The Smoothing Subroutine smoothes the data 
array three times, using each span value once, and then computes the 
residual values corresponding to the three smoothed arrays. Then each 
array of residual values is smoothed using SPAN 2 in order to reduce 
the total variability and create smooth transitions between adjacent resi- 
dual values. The smoothed residual values are then returned to the 
Combining Subroutine where the optimal span values are determined and 
adjusted using the outlier rejection rule. The adjusted optimal span 
values are then sent back to the Smoothing Subroutine for smoothing 
with SPAN 2 . This is done so that variability between the values will 
again be reduced. The now smoothed, adjusted, optimal span values 
are returned to the Combining Subroutine where they are used in an 
interpolation procedure. The results of this interpolation procedure are 
estimates of the final smoothed values . These estimated smoothed 
values are then returned to the Smoothing Subroutine for smoothing 
with SPAN-^ in order to reduce the variability of these values. This 
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procedure will also accentuate outliers in the raw data because of the 
small neighborhood size of SPAN^. The final smoothed values are 
returned to the Combining Subroutine which forwards the results to the 
user’s main program for his/her use. 

The Combining Subroutine does the following: 

1. keeps track of pertinent computational results; 

2. computes the interquartile range of the abscissa; 

3. defines zero for computational and comparative purposes; 

4. determines the optimal span corresponding to each abscissa; 

5. applies the outlier rejection rule in order to adjust the 
robustness; 

6. estimates the smoothed output. 

If only one span value is used, only the first three items of the above 
list are executed by the Combining Subroutine, and the Smoothing 

Subroutine is used only once. If three span values are used, all the 
items are executed. 

The smoothed output produced by the Smoothing Subroutine is not 
the final smoothed values given to the user. Therefore, to be able to 
distinguish the output forwarded to the user, i.e. the smoothed Y 
values, from the smoothed values exchanged between the subroutines, 
any array to be smoothed by the Smoothing Subroutine, e.g. the resi- 
dual values, will be called Z within the Smoothing Subroutine. Z. 

After the array is smoothed and returned to the Combining Subroutine, 
it regains it's usual name. The Smoothing Subroutine does the 
following: 

1. computes the neighborhood size, (IT), the number of points to 
be included in the local averaging; 

2. computes the base mean, variance, and covariance values that 
will be used in the computation of the smoothed values Zj; 

3. computes the smoothed values Zj at the beginning of the data 

array, i.e. the first (IT/2) smoothed points that are not 

usually computed by most smoothers; 
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Figure 2.2 Data Flow of Supersmoother. 
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4. 


computes 

array; 


the 


smoothed 


values Zj for the middle 


of the data 


5. 


computes 


the 


smoothed 


values Zj for the end of 


the data set. 




i.e. the 


last 


(IT/2) 


smoothed points that are 


not usually 



computed by most smoothers. 

The smoothed values Zj are the result of a local linear fit within the 
neighborhood of points about Xj, the abscissa corresponding to the 
smoothed Zj. The Smoothing Subroutine computes the cross- validated 
residuals only at the time when the input data is smoothed, see Figure 
2 . 2 . 

B. MATHEMATICAL DETAILS — COMBINING SUBROUTINE 

The Combining Subroutine requires the following user input: 

1. N, the number of points to be smoothed; 

2 . Y^, . . ., the point values that need smoothing; 

3. X^, . . ., X^, the abscissa corresponding to the Yj values if 

the abscissa do not belong to the interval [ 1 . 0 , 2 . 0 , . . ., 

N] ; 

4. IPER, equals 1 or 2, to indicate that the abscissa belong to the 

interval [1.0, 2.0, . . . , N] or the interval (0.0, 1.0], 

respectively; 

5. the span values SPAN-p SPAN 2 , and SPAN 3 ; 

6 . ALPHA, the outlier rejection rule index. 

The Smoothing Subroutine assumes that the input data set is in chrono- 
logical order , i.e. Yj occurred before Yj + ^ where 1 = 1 , . . ., (N-l); 

thus, the abscissa will be in increasing order. 

The abscissa interquartile range, SCALE, is computed using equa- 
tions 2.1 through 2.3: 



I = — ; 


(2.1) 


4 




J = 3x1 ; 


(2.2) 
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SCALE = Xj - X,. . 



(2.3) 



If N<4 and if the computer allows indices with value of zero, then 
SCALE=0.0, otherwise an error is created. In order to define zero, 

VSMLSQ, SCALE must be greater than zero. If SCALElO.O, then J=J+1 
and SCALE is recomputed using equation 2.3. Zero is defined by 

equation 2.4: 



VSMLSQ = [(lx 10 _s )x SCALE] 2 . (2.4) 

If only one span value is used, the Smoothing Subroutine is called 
by the Combining Subroutine, and the smoothed data array returned is 
the smoothed Yj, see Figure 2.1. If this procedure is used, the 
smoothed Yj will have too much variability [Ref. 9: p. 9] and will be 
very robust, so it is best to use three span values. 

When three span values are used, the input Yj are smoothed three 
times, once with each span value, see Figure 2.2. For ease of discus- 
sion, Yjg will be used to indicate the I** 1 input Y value smoothed using 
SPANg, where 1=1, . . . , N and S=l, 2, 3. As mentioned before, 
during the smoothing of the input Yj, cross -validated residual values 
are computed. These residual values will be identified by ACVRjg, i.e. 
the I** 1 cross-validated residual computed when SPANg was used. In 
order to reduce the variability of the smoothed Yj, the ACVRjg are 
smoothed using SPAN 2 , see Figure 2.2. For stability reasons an array 
containing the absolute value of the ACVRjg is smoothed. [Ref. 9: p. 
9]. After the smoothing of the absolute value of the ACVRjg, each 
abscissa Xj, has the following seven corresponding values: 



1. 


the 


input Yj; 


2. 


Y I1 


and ACVRjg ; 


3. 


Y I2 


and ACVRj 2 J 


4. 


Y I3 


and ACVRjg . 



Next follows the basis of the local cross-validation method. First 
for each I the minimum of ACVRjg, ACVRj 2 , and ACVRj is selected and 
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designated ACVR^j^. Recalling that the second subscript of ACVRjg 
indicates the SPANg used to smooth the input data set and produce the 
corresponding ACVRjg, then the SPANg used to produce ACVR^^ can 
be determined and designated SCj. Then the outlier rejection which 
consists of equations 2.5 and 2.6, shown below, can be used to adjust 
SCj, i.e. the span value used to compute ACVR m | n , in order to reflect 
the degree of robustness desired by the user: 

SC, = SC, + (SPAN, - SC,) x a m 10 °- alpha ; (2.5) 

where 



AM = ABS 






max 



1.0x10 



_ T ACVR min 






ACVRj, 



J 



( 2 . 6 ) 



The resulting SCj is called the "estimated optimal span" [Ref. 9: p. 10] 
corresponding to I. The set of estimated optimal spans may have an 
unnecessarily high variance, thus they are smoothed using SPAN 2 ; the 
result is the set of optimal spans, SCj. 

Each SCj value is checked using one of the two following logical 
statements in order to verify that the span boundaries fixed by the 
user are not violated: 

1. if SCj^SPANj, then SCj=SPAN 1 or; 

2. if SCj>SPAN 3 , then SCj=SPAN 3 . 

Each SCj value is used to estimate a smooth Yj value by interpolating 
between two of the Yjg values previously computed. The sign and 
value of F in equation 2.7 forms the basis of the interpolation. 

F = SC, - SPAN, . (2.7) 

If F is negative then equations 2.8 and 2.9, shown below, are used to 
estimate the smooth Yj; 



F = 



-F 

SPAN, - SPAN, 



( 2 . 8 ) 
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estimated smooth Yj = [(1.0 - F)xY 2 ] t [FxYu] , 



(2.9) 



otherwise equations 2.10 and 2.11, shown below, are used: 



F = 



F 

SPAN. - SPAN 2 1 



( 2 . 10 ) 



estimated smooth Yj = [(1.0 - F)xY 1 2 ] — [FxY ls [ . ' " ' 

The final smooth Yj values are obtained by smoothing the estimated 
smooth Yj using SPANj. This smoothing is done in order to reduce the 
variability of the estimated smooth Yj caused by the variance in the 
input data. 

C. MATHEMATICAL DETAILS — SMOOTHING SUBROUTINE, PRIMARY 
USE 

The primary use of the Smoothing Subroutine is to smooth data with 
abscissa values in the interval [1.0, 2.0, . . ., N] . The secondary 

use of the Smoothing Subroutine is to smooth data with abscissa values 
in the interval (0.0, 1.0]. The Smoothing Subroutine requires that the 
following data be transferred from the Combining Subroutine: 

1. N, the number of points to be smoothed; 

2. the array to be smoothed, in this subroutine this array will be 
referred to as Zj, . . ., 

3. Xj, . . ., Xjj, the abscissa that correspond to the Zj; 

4. SPAN, the span value; 

5. a flag, IPER, which indicates whether the cross -validated resi- 
duals are to be computed or not computed; 

6. VSMLSQ, the defined value of zero. 

The size of the neighborhood of points included in the local aver- 
aging is determined by SPAN. Most smoothers will not compute the size 
of the neighborhood and require that the user enter an odd integer 
number indicating the size of the neighborhood. Supersmoother will 
compute the size of the neighborhood, thus allowing the user an infinite 
number of choices, since the value of SPAN, as entered by the user, 
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belongs to the interval (0.0, 1.0]. Since the computation of the neigh- 
borhood results in an integer value, different SPAN value entries may 
result in the same neighborhood size, e.g. if N=100, then SPAN values 
of 0.04 and 0.045 will both result in a neighborhood size of 4. 
Supersmoother uses two neighborhood sizes, IBW and IT, both integer 
values. The first (IBW+l) smoothed points and the last IBW smoothed 
points of the output Z array are computed differently than the central 
smoothed points. Most smoothers drop IBW points at the beginning and 
at the end of the smoothed Z array, e.g. the Moving Average smoother 
mentioned in Chapter I. IT is the number of points included in the 
local averaging. These IT points are the nearest neighbors of Xj, the 
abscissa corresponding to the smoothed point being computed. 
Supersmoother will always compute IT to be an odd integer. IBW, on 
the other hand, sometimes may be odd or even, depending on the value 
of N and SPAN. Since IT is odd, the Xj will be the median of the 
neighborhood with l (IT/2), integer division, points to the left and 
right. The following two equations are used in the computation of the 
neighborhoods, IBW and IT: 

IBW = (O.SxSPANxN) - 0.5 ; ' (2.12) 



IT = (2- IBW) - 1 



(2.13) 



The first IT values of the X and Z arrays are used to compute the 
base or initial values of an , Z mean , covariance of X and Z, and 
variance of X using equations 2.14 through 2.17: 



X 



mean 



IT 



j= i 

IT 



z 



mean 



£Zj 



j= i 



IT 



(2.14) 



(2.15) 
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cov xz - 



e[(Xj - X m .Jx(Zj 

J=J L 




rr 



VAR X = £(X a - X, 

j» 1 



(2.16) 

(2.17) 



The first (IBW+1) smooth Zj are computed using the results from equa- 
tions 2.14 through 2.17. The first step in the computation of these 
smooth Zj is to find the slope. A, of the least squares straight line 
through the set of points (X^, Z^), . . ., (Xj^, Zj.p) [Ref. 9: p. 5]. 
If VAR^<VSMLSQ, then A=0.0, otherwise equation 2.18 is used: 



A = 



COV 



xz 



VAR, 



(2.18) 



The second step is the actual computation of the smooth Zj using the 
slope. A, computed with equation 2.18, the results from equations 2.14 
and 2.15 and the following linear equation: 



smooth Zj — A x (Xj X mein ) — Z meu) 



(2.19) 



The cross- validated residual, ACVRj, are computed using the following 
procedure : 

1 . compute the portion of the neighborhood occupied by the 
smooth point, H, using equation 2.20: 



H = 



1.0 
IT ’ 



( 2 . 20 ) 



2. if VAR^>VSN1LSQ, then this large degree of variability inherent 
in the raw data must be reflected in H using equation 2.21: 



H - H * 



(X, - X mt J 2 
VARv 



( 2 . 21 ) 



3. finally the cross- validated residuals are computed with equation 

2 . 22 : 



ACVR, = 



ABS(Zj — smooth Zj) 
1.0 - H 



( 2 . 22 ) 
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Recall that only the first (IBW+1) smoothed Zj values have been 
computed. 



In order to compute the smoothed . . . , ^N-IBW* the 

neighborhood of points has to be moved from one point to the next 
point toward the right. This is where the dynamic programming 
computational procedures are useful in cutting the storage space and 
computer time. The results from equations 2.14 through 2.17 are 
updated to reflect the movement of the neighborhood, i.e. the left 
endpoint of the neighborhood will be dropped and the point to the right 
of the right endpoint will enter the neighborhood. Equations 2.23 
through 2.30 are used for each I, where 1= (IBW+1), . . ., (N-IBW). 

Equations 2.23 through 2.26 are used to drop a point from the 
neighborhood: 



X, 



(ITxX^J - X,_rr 
IT - 1 



(2.23) 



(IT » Z mttn ) - Z,_rr 
IT - 1 



(2.24) 



IT -* ( X|_ [-]■ — X metn ) > (Z]_rr ^mim) 



(2.25) 



COV xz = COV xz - 



IT - 1 



VAR X = VAR X 




(2.26) 



IT - 1 



Equations 2.27 through 2.30 are used to add a point to the 
neighborhood: 



X 



I (IT - -r X, 

IT 



(2.27) 
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I (IT - ijxZ^J + Zj 
IT 



(2.28) 



C0V X 7 = COVvz + 



ITx(X,~ X mtM ,)x(Z, - Z mttn ) 
IT - 1 



VAR-x = V ARx 



ITx (X, - X metn ) 2 
IT - 1 



(2.29) 

(2.30) 



The results from equations 2.27 through 2.30 are then used in equa- 
tions 2.18 through 2.22 to compute the smoothed Zj and the cross- 
validated residuals if necessary. 

The X mea , Z mean , and VAR.^ values used to compute 

smooth are used to compute the smooth Zj values where 

I=(N-IBW+1), . . . , N, i.e. the smooth values at the tail-end of the Z 
array. These mean and variance values are used in equations 2.18 

through 2.22 in the computation of the smooth Zj. This procedure is 
equivalent to the procedure used to compute the smooth Zj, where 1=1, 

. . ., (IBW+1) . 

D. MATHEMATICAL DETAILS- --SMOOTHING SUBROUTINE, 

SECONDARY USE 

The secondary use of the Smoothing Subroutine is to smooth data 
with abscissa values in the interval (0.0, 1.0]. The Smoothing 

Subroutine needs the same data and follows the same steps and equa- 
tions as if the abscissa were in the interval [1.0, 2.0, . . ., N] . The 
exceptions are noted in this section. 

When using equations 2.14 through 2.17, the first IBW points and 
the last (IBW+1) points of the X and Z data arrays are used to compute 
the initial X^^, ^mean' covariance of X and Z, and the variance of X 
values. Equations 2.14 through 2.17 are then changed in order to 
allow these new points to be involved in the computations. Equations 
2.31 through 2.34, shown below, are the result of the change and are 
used in the computation of the initial values of X^ , Z mean , covari- 
ance of X and Z and variance of X: 



X 



mean 



E 



J= N- EB W ~i 

IBW - 1 



EBW 

E x j 



i 



IBW 



(2.31) 
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mean 



(2.32) 



Z 



ID W 

S Zj E 

J=N-mw.i j.j 

IBW -r 1 ~ IBW 



N IBW 

COV xz = 2 (Xj-X m , 10 )x(Z J -Z m , 4I1 )-E(X J -X intul )x(Z J -Z ni< J ; (2.33) 

J=N-fflW»l J=1 



VAR X 



E (x. 



J»N-EBW-rl 



IBW 

X m ,* n )' “ E (Xj - x mt4 j 

J-! 



(2.34) 



The next step in this smoothing procedure is to drop a point from 
the neighborhood. In the previous Section, equations 2.23 through 
2.26 were used for this task, but they cannot be used in this section 
because the input point counter indicates that negative index values are 
computed at the beginning of the computations. The negative indices 
are the result of the last (IBW+1) point being being used in equations 
2.31 through 2.34. Thus to keep the point counter on track let 
K=N+I-IBW-1 and change equations 2.23 through 2.26 as indicated in 
equations 2.35 through 2.38, respectively. Then in order to drop a 
point from the neighborhood, equations 2.35 through 2.38 are used: 



X 



mean 



(HVX^J - X K - 1.0 
IT -1 



(ITxZ„ 



— Zv — 1-0 



IT - 1 



(2.35) 



(2.36) 



cov xz = cov xz - 



ITx (X K - 1.0- X ntMI )x(ZK - Z nM ) 
IT -1 



(2.37) 



VAR X 



VAR X 



ITx(X k - 1.0- X mt J* 
IT -1 



(2.38) 
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Since a point was dropped from the neighborhood the next adjacent 
point on the right boundary of the neighborhood must be entered into 
the neighborhood. In the previous Section, equations 2.27 through 
2.30 performed this task, but in order to keep the point counter on 
track these equations must be modified. Therefore, let K=I+IBW and 
the new equations are equations 2.39 through 2.42. Thus to add the 
next point to the neighborhood, these new equations are used: 



X„ 



[(IT - l)xX. tlI ] - X K 



IT 



[(IT - l)xZ mein J Z K 



IT 



(2.39) 



(2.40) 



COV xz = COV xz - 



IT x (X K - X wt Jx(Z K 
IT - 1 



(2.41) 



VAR X = VAR X 



IT x (X K - X mean ) 2 
IT - 1 



(2.42) 



The results produced by equations 2.39 through 2.42 are then used 
in equations 2.18 through 2.22 to compute the smooth Zj, where 1=1, . 

. ., (IBW+1) . 

In order to compute the middle smooth Zj values, i.e. smooth Zj 
where I=(IBW+2), . . ., (N-IBW), equations 2.23 through 2.30 and 
equations 2.18 through 2.22 are used as they are, i.e. no changes 
involved. 

The computation of the last (IBW-1) smooth Zj values, i.e. smooth 
Zj where I=(N-IBW+1), . . . , N, involves changing equations 2.23 

through 2.26 a second time, in order to maintain the point counter on 
track. This change is needed because the first (IBW-1) input points 
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are used to compute these smooth Zj and the point counter must not 
exceed N, the number of points to be smoothed. Thus let K=I+IBW-N 
and the result of the change is shown in equations 2.43 through 2.46. 
Equations 2.44 through 2.46 are used to drop a point from the 
neighborhood: 



X 



mean 



(ITxX m ,, n ) - X K + 1-0 

IT - J ’ 



(2.43) 



Z 



mean 



(IT*Z m „J - Z K - 1-0 

IT - 1 1 



(2.44) 



COV xz - CO\ xz 



ITx(Xk — 1-0 — X mt Jx(ZK — Z me4n ) 
IT - 1 



(2.45) 



VAR X = VAR X 



< ITx (X K + 1.0 - X mttD ) 2 
IT - 1 



(2.46) 



In order to replace the point dropped from the neighborhood, equa- 
tions 2.27 through 2.30 are used, but in a different form because of 
the same reason that equations 2.23 through 2.26 were changed above. 
Therefore, with K=I-IBW-1 the equations used to add a point to the 
neighborhood are equations 2.47 through 2.50: 



X 



main 



[(IT - l)xX ratao ] 

It 




(2.47) 



z 



main 



[(IT - l)xZ m _] - Z K 

It 



(2.48) 



cov xz = cov xz 



ITx(X k - X mecD )x(ZK - z mt j 
IT - 1 



(2.49) 
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VAR X = VAR X - 



ITx(X k - X m „ n ) 2 
IT - I 



(2.50) 



The results produced by equations 2.47 through 2.50 are then used 
in equations 2.18 through 2.22 in order to compute the smooth Zj where 
I=(N-IBW+1) , . . . , N. The disadvantage of using abscissa values 

from the interval (0.0, 1.0] versus abscissa from the interval [1.0, 

2.0, . . . , N] is that a slight degree of distortion is produced at the 

ends of the smoothed Z array. The distortion could be caused by the 
1.0 adjustment factor in equations 2.35, 2.36, 2.43, and 2.44. 

E. SELECTION OF SPAN 

The span value is the parameter that controls the smoothing of a 
data set. There exist no set procedures for selecting a span value. 

Each data analyst has his/her own method of selecting the span value. 

The analyst's experience with smoothers determines how the span is 
selected. Selection of the span value is basically a subjective process, 
where the analyst uses a span value which gives adequate and useful 
results. The user of the advanced smoothers should develop a consis- 
tent, span selection process. A common procedure used by some expert 
smoothers starts by looking at a scatterplot of the raw data. Then the 
analyst looks for periodicity and cyclic changes present in the data. 
This information is then used to estimate the span value to be used in 
the smoothing. For example, if a data set displays a cycle of about 24 
points, then the span to use should be about 24/N, where N is number 
of points to be smoothed. This span value is a good estimate because 
the raw data is permitted to determine the shape of the smooth results. 
This procedure is used in Chapter V of this thesis in the smoothing of 
a large set of sea-surface temperatures. 

Supersmoother is unique among smoothing algorithms in that three 
span values, i.e. SPANp SPANg, and SPANg, may be entered by the 
user. Supersmoother will then select an optimal span value within the 
range of the smallest span value and the largest span value by using 
the method of cross-validation which was explained earlier in this 
Chapter. This option within Supersmoother lets the user be very 
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flexible in the selection of the span values to use. But the user must 
be careful about which span values to use with Supersmoother. It is 
best to first try span values of 0.05, 0.2, and 0.5, as recommended by 
Friedman and Stuetzle [Ref. 9: p. 9], This range of span values gives 
Supersmoother good coverage of the data. After viewing the results 
produced by Supersmoother, the user can adjust the span values in 
order to get the desired smooth effect. When adjusting, the user must 
bear in mind the bias/variance trade-off discuss earlier in this chapter. 
The trade-off being that if the span value is increased, then result is a 
smoother looking curve, while the reverse occurs when decreasing the 
span value. 

No matter what rule is followed to determine the span values used in 
Supersmoother, the final smooth results accepted are based on subjec- 
tive needs, applications, and preferences. 
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III. TECHNICAL DESCRIPTION OF SPLIT LINEAR FIT ALGORITHM 



A. OVERVIEW 

The Split Linear Fit smoothing algorithm was developed at Stanford 
University by John A. McDonald and Art B. Owen. [Ref. 10]. The 
Split Linear Fit smoother produces piece-wise smooth curves and thus 
will depict discontinuities present in the input data [Ref. 10: p. 1] . 
Most smoothers tend to distort discontinuities because the weighted 
averaging technique used to compute a smoothed point requires a 
continuous underlying function. The Split Linear Fit smoother will not 
distort the smooth curve at discontinuous points and does a very good 
job of detecting sharp slopes in the input data. This is the reason the 
Split Linear Fit algorithm is sometimes classified as an edge-detecting 
smoother [Ref. 10: p. 2]. 

The Split Linear Fit smoother is similar to Friedman and Stuetzle's 
Supersmoother in several ways: 

1. every input point receives a respective smooth point; 

2. the user can enter more than one neighborhood size; in this 
algorithm the neighborhood sizes are called window sizes, where 
window of size K is defined as "a set of K successive point" 
[Ref. 10: p. 2], (window size and span are equivalent terms); 

3. the window is shifted to the right by dropping the left 
endpoint and then adding the point adjacent to the right 
endpoint; 

4. the method of least squares is used to estimate a straight line 
through the points within the shifting window; 

5. the Split Linear Fit smoother is scale independent. 

A major difference between the Split Linear Fit smoother and 
Supersmoother is the method used to combine the linear fitted values. 
Another difference is that the Split linear Fit smoother does only robust 
smoothing . 
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The objective of the Split Linear Fit smoother is to produce a piece- 
wise smoothed curve with minimal discontinuous features [Ref. 10: p. 
2]. Figure 3.1 shows the Split Linear Fit smoothing algorithm as 
composed of three subroutines: 

1. the Regression Subroutine; 

2. the Weighting Subroutine; 

3. the Combining Subroutine. 




Figure 3.1 Data Flow in Split Linear Fit Smoother. 

Figure 3.1 also shows that the Split Linear Fit smoother uses the itera- 
tive process once on its output. This is done in order to decrease the 
variance in the first set of output, since, as mentioned before, the 
Split Linear Fit smoother only does robust smoothing. If the first set 
of output were to be plotted, the curve would appear very jagged. 
Passing the first set of output through the Split Linear Fit algorithm, 
decreases the robustness and variability of the final output. 
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The number of window sizes entered by the user dictates the 
number of times that the input data is passed through the Regression 
Subroutine to produce a family of linear fitted values and residual 
values associated with each I, 1=1, . . N, where N is the number of 
points to be smoothed, see Figure 3.2. Each family of linear fits may 
be viewed as a pseudo -distribution of linearly fitted estimated values of 
an input point value. 




Figure 3.2 Regression Subroutine in Generalized Form. 
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The Weighting Subroutine receives the N families of linear fits and 
mean squared residual values from the Regression Subroutine and finds 
the minimum mean squared residual value within each family of mean 
squared residual values. Not all the mean squared residual values 
qualify as candidates for the minimum mean squared residual values. 
An acceptable mean squared residual value is one that does not exceed 
an established cutoff value. The cutoff value used in the Split Linear 
Fit algorithm is the value -1.0x10 . This value was selected because 

it provided a better smooth curve at discontinuities inherent in the 
raw, input data than other cutoff values [Ref. 10: p. 3]. The minimum 
mean squared residual value is used as a base to compute a weight 
corresponding to each of the acceptable mean squared residual values 
within the family, see Figure 3.3. These weights are used by the 
Combining Subroutine in computing the smooth point values. The 
weights "depend on a measure of the quality of the corresponding linear 
fits" [Ref. 10: p. 2]. Quality meaning that the smaller the mean 

squared residual value the higher the weight assigned to the corre- 
sponding fitted value. The weight assigned to a fitted value is a func- 
tion of the following: 

1. the corresponding mean squared residual; 

2. the minimum mean squared residual, and; 

3. the average of the acceptable mean squared residuals within the 
associated window. 

This weighting procedure is used in order to smoothly integrate discon- 
tinuities in the input data with the other smooth points. This proce- 
dure is the edge-detector and is the cause of the robust smoothing. 

The smooth point value at I is a weighted average of the linear fits 
in the family of linear fits corresponding to I. The Combining 
Subroutine combines the weights produced by the Weighting Subroutine 
and the fitted values produced by the Regression Subroutine associated 
with I and computes the respective smooth point value, see Figure 3.4. 

As mentioned before the first set of smooth point values produced 
by the Combining Subroutine is itself passed through the Split Linear 
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Figure 3.3 Weighting Subroutine in Generalized Form. 



Fit algorithm again in order to reduce the variability. The second set 
of smooth point values is the output generated to the user, see Figure 
3.1. 

B. MATHEMATICAL DETAILS — REGRESSION SUBROUTINE 

The Regression Subroutine requires the following user input: 

1. N, the number of points to be smoothed; 

2. Yj, . . ., the point values to be smoothed (in chronolo- 

gical order) ; 
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Figure 3.4 Combining Subroutine in Generalized Form. 

3. X-p . . ., Xjq-, the abscissa corresponding to the Y values (in 
ascending order since the Yj are in chronological order); 

4. NTRYS, the number of window sizes to be used; 

5. WNSZp . . . , WNSZj^<j>j^y 2 , the values of the window sizes; 

S. MNWNSZ, the minimum window size permitted by the user. 

The minimum window size, MNWNSZ, is the lower bound set on the 
window size. The value of the lower bound should be at most one-half 
the value of the smallest window that will be used in the smoothing. If 
MNWNSZ is any larger then some smooth points will be dropped from the 
ends of the output array, or a plot of the smooth point values will show 
distortion at the ends. 



45 



Figure 3.2 shows the Regression Subroutine using a procedure 
resembling the iterative process, but each pass of the Yj values 
through the Regression Subroutine uses a different window size, and 
some variables are reset to zero. The purpose of the Regression 

Subroutine is to compute a family of fitted values and a family of mean 
squared residual values corresponding to each I, 1= 1, . . ., N. The 

Regression Subroutine is shown in more detail in Figure 3.5. The 
Regression Subroutine can be divided into three parts: 

1. definition of zero and computation of sum of first (MNWNSZ-1) 
values and computation of fitted values for 1=1, . . ., 
(MNWNSZ-1); 

2. shifting of window and computation of fitted values and mean 
squared residual values for I=MNWNSZ, . . ., (N-MNWNSZ + 1) ; 

3. computation of fitted values for 1= (N-MNWNSZ+2) , . . ., N. 

The variable EPS is used to define zero for computational purposes. 
The interquartile range of the abscissa array is used in the computation 
of EPS, as shown by equations 3.1 through 3.3: 



• 1^ 

II 


(3.1) 


JR = 3x JL ; 


(3.2) 


EPS = X JR - X JL . 


(3.3) 



If EPS<0.0 and JR<N, then EPS is recomputed using the following three 
rules: 

1. if JR<N, then JR is increased by a value of one; 

2. if JL>1, then JL is decreased by a value of one; 

3. EPS is recomputed using the new values of JR and JL and 
equation 3.3. 

EPS will be equal to zero only if N<3 and if the computer allows index 
values equal to zero, otherwise a computer error will result. If this 
situation occurs then items 1 and 3 from above will apply. Since the Xj 
values are in ascending order, EPS will have a value greater than zero 
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after the first recomputation of JR. After EPS has been defined as 
being greater than zero, it is adjusted using equation 3.4 in order to 
define zero for computational purposes: 



EPS = (EPS x l.OxlCT 10 ) 2 



(3.4) 



In order to fit a linear model on the window endpoints and central 



point, slope, and y-intercept values of the model must first be 
computed. The parameter MNWNSZ dictates which input point values 
will be used to compute the initial slope and the corresponding 
y-intercept value. The first MNWNSZ values of the input data are used 
to compute these necessary values. The first (MNWNSZ-1) values of 
the input data are used in equations 3.5 through 3.10 to compute the 
basic Siam values to be used later and increment a counter which keeps 
track of the input points: 



MNWNSZ- 1 



SUM X = £ x, ; 



(3.5) 



MNWNSZ-1 



SUMy = £ Y, ; 



(3.6) 



i- 1 



KOUNTER = MNWNSZ - 1 ; 



(3.7) 



MNWNSZ-1 



SUM X s Q - 




(3.8) 



!=■ 1 



MNWNSZ-1 



SUMysp - £ Yj J ; 



(3.9) 



i 
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Figure 3.5 Data Flow in the Regression Subroutine. 
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SUMxy = 



(3.10) 



mnwnsz-i 

E (X.xY.) . 



i»i 



For each 1=1, . . ., (MNWNSZ-1), the right endpoint fitted value and 
the mean squared residual value are set to the value of -1.0x10 . 

This is done so that the Weighting Subroutine will assign a zero weight 
to the left endpoint fitted values and the mean squared residual values 
corresponding to these I. Therefore, the smooth values corresponding 
to these I values are computed using a smaller window [Ref. 10: p. 3]. 
This procedure does not use the window concept that is mentioned 
below. This is done to avoid computer errors since most of the points 
in the window corresponding to these I will have negative index values. 

Figure 3.5 shows the iterative process that is used to compute the 
family of fitted values and the family of mean squared residual values 
corresponding to I, I=MNWNSZ, . . ., (N-MNWNSZ+1) . The first step 

determines the window central point and the endpoints that correspond 
to I. For ease of understanding, let K be the number of successive 
points in the window. If K is odd then the central point is equivalent 
to the median of the window and has l (K/2) neighboring points to the 
left and right of it. If K is an even number, then the central point is 
will have [(K/2)-l] neighboring points to the left and (K/2) neigh- 
boring points to the right, thus the central point is the point to left of 
the window median. The index of the right endpoint will always be 
equal to I. If K is odd, the index of the central point will be equal to 
[I- L(K/2)], and the index of the left endpoint will be equal to 

(I-K+l). If the value of K is even, the index of the central point will 
be equal to [I - (K/2 ) ] , and the index of the left endpoint will be equal 
to (I-K+l) . Point values that have corresponding index values that are 
negative or zero are not included in the linear fit. 

The next step adds the I** 1 point to the window and uses the 
method of least squares to estimate the straight line through the points 
in the window. The procedure for adding the I** 1 point to the neigh- 
borhood adds the values produced by equations 3.5 through 3.10 to the 
Xj and the Yj values using equations 3.11 through 3.16: 
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SUM* - SUM X + X! : 


(3.11) 


SUMy = SUM y Y, ; 


/—• s 
CM 
tH 

CO 


KOUNTER = KOUNTER + 1 ; 


(3.13) 


SUMxsq = SUMxsq - Xf ; 


(3.14) 


SUMysq = SUMysq + Yj* ; 


(3.15) 


l 

SUMxy = SUMxy + (XjxY,) . 


(3.16) 



Next the mean of the sum values computed by equations 3.11 through 



3.16 is computed using the value of KOUNTER as 
equations 3.17 through 3.21: 


the denominator 


SUM X 

MEAN X - K0UNTER ■ 


(3.17) 


SUMy 

MEANy - K0UNTER • 


(3.18) 


SUMxsq 

MEANxsq - K0UNXER ' 


(3.19) 


SUMysq 

MEANysq - KOUNTER ’ 


(3.20) 


SUMxy 

MEAN'xy = ■ 

XY KOUNTER 


(3.21) 



The variance of the abscissa is derived by equation 3.22: 

XVAR = MEAN xsq - MEAN* . (3.22) 

The method of least squares is used to compute the slope and the 
y- intercept of the straight line fitted to the points in the window. The 
results produced by equations 3.17 through 3.18 are used to compute 
the coefficients of the straight line that is fitted to the points within 
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the window. If XVAR^O.O, then the slope of the straight line, SLOPE, 
is zero, i.e. SLOPE = 0.0, otherwise the value of SLOPE is computed 
with equation 3.23: 



SLOPE = 



MEANxv 



(MEANx xMEANy) 
XVAR 



(3.23) 



The y-intercept of the straight line is computed using equation 3.24: 

INTER = MEANy - {SLOPE x MEAN X ) . (3.24) 

The mean squared residual value about the fitted line is computed with 
equations 3.25 through 3.28: 



MEAN RS q = A + B + C j 



(3.25) 



where 

A = MEANysq - ( 2x INTER xMEANy) - (2xSLOPExMEAN XY ) ; (3.26) 

B = INTER 2 -+ (2xINTERxSLOPExMEAN x ) ; (3.27) 



C = MEAN xaq x SLOPE 2 . (3.28) 

The window central point and endpoints are fitted to a linear model 
using the slope and the y-intercept value computed above to produce 
the fitted value FITjyp, where 1= current I 111 value, W= current window 
size, and P= left endpoint, central point, or right endpoint. The mean 
squared residual value, MSQRj^p, is computed using the computed local 
linear fit coefficients, the Xj and Yj values, and the counter value in 
equations 3.29 through 3.31: 

FITjwp = INTER t (SLOPExXj) ; (3.29) 

RES = Y, - FIT,wp ; (3.30) 
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MSQRjwp 



(. KOUNTER x.MEANrsq) - RES 2 
KOUNTER - 1 



(3.31) 



After FITjyp and MSQRj^p have been computed for the I** 1 

window's central point and endpoints, the window must be shifted to 
the right. The Supersmoother algorithm uses the same procedure as 
the Split Linear Fit, i.e. dropping the left endpoint from the sum 

values of equations 3.11 through 3.16 and then adding the entering 

point to these same equations. Let IL be the index of the left 

endpoint, then this point is dropped using equations 3.32 through 3.37: 



SUM X = SUM X - X a ; 



(3.32) 



SUM y = SUM y - Y a ; (3.33) 

/ 

KOUNTER = KOUNTER - 1 ; (3.34) 

SUMxsq = SUM xsq - Xi ; (3.35) 

SUM YS q = SUMysQ - Y a ; (3.36) 

SUMxy - SUM X y - (X a x Y a ) . (3.37) 



Next, I is incremented by one and, if I<(N-MNWNSZ + 1) , equations 3.11 
through 3.16 are used to enter the new I^ 1 point into window. 
Equations 3.17 through 3.37 are then repeated using the new values. 
This procedure is continued until I>(N-MNWNSZ + 1) . 

When I>(N-MNWNSZ + 1) , the left endpoint FITj^/p and MSQRjwp 
values corresponding to the values of I are set equal to -1.0x10^®, so 
that these fitted values are assigned no weight in the Weighting 
Subroutine. This procedure was used for 1=1, . . ., (MNWNSZ-1) at 
the beginning of the Regression Subroutine. 

If the user entered more than one window size, then the input data 
is passed through the Regression Subroutine with the next window size. 
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The sums and counter of equations 3.5 through 3.10 are initialized to 
zero before repeating the Regression Subroutine. After the last window 
size has been used, each window will have contributed a total of six 
values to each I, i.e. three linear fitted values and three mean squared 
residual values . 

C. MATHEMATICAL DETAILS- --WEIGHTING SUBROUTINE 

The objective of the Weighting Subroutine is to compute a weight 
which is indicative of the degree of goodness of fit of each linear fitted 
value, FITj^p, with respect to a line with slope equal to one. Figure 
3.3 shows the procedure followed by the Weighting Subroutine. As 
noted in the figure, each family of mean squared residual values is 
used one set at a time. The following data is transferred from the 
linear Regression Subroutine: 

1. N,the number of points to be smoothed; 

2. NTRYS,the number of windows used in the smoothing; 

3. N families of mean squared residual values, MSQRj^p; 

4. N families of fitted values, FITjyp. 

The Weighting Subroutine is executed once for each family of mean 
squared residual values. The Regression Subroutine produced a family 
of (3xNTRYS) mean squared residual values corresponding to each I. 
For computational feasibility a lower bound of -1.0x10 is set on the 
values of mean squared residual, i.e. the MSQRj^p. Each MSQRjyp 
value is compared against -1.0x10 , and the MSQRj^p values less than 

or equal to -1.0x10 are marked as unacceptable. These are not 
considered in the search for the minimum MSQRjyp within the I** 1 family 
MSQRj^p- The minimum MSQRp^p corresponding to I is found by doing 
a comparison between the acceptable MSQRjyp values in the I L family 
of MSQRj^p. The expressions listed below are used by the Weighting 
Subroutine on each family of MSQRjyp: 

i.L 

1. MIN is the minimum MSQRj^p in I in family; 

2. LAMBDA is the sum of MSQRp^p greater than -l.OxlO 3 ^*; 

3. LAMBDA is divided by the number of MSQRjyp greater than 
-l.OxlO 30 ; 
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4. LAMBDA is then reduced by the value of MIN. 

If the value of LAMBDA is less than or equal to zero, then LAMBDA is 
not modified, otherwise LAMBDA is recomputed using equation 3.38: 



LAMBA = 



1.0 

LAMBA 



(3.38) 



If the value of MIN is positive, MIN is not changed, otherwise it is 
made a positive value using equation 3.39: 



MIN = lx 10' 10 . 



(3.39) 



The last step in the Weighting Subroutine is to compute a family of 
weights which indicate the goodness of fit of the linear fitted values, 
FITiwp, using the corresponding family of MSQRj^p values. The 
MSQRjyp values which are less than -1.0x10 cause a weight of zero 

to be attached to the corresponding fitted value, FITjyp. The reason 
for this occurring is that these values are considered unacceptable 
based on the established cutoff value discussed in Section A of this 
chapter. If the MSQRjyp value satisfies the cutoff rule, then a weight 
will be computed indicating the goodness of fit of the corresponding 
FITiwp. The weight is a function of the quality of the corresponding 
MSQRjyp value. TEMP indicates the degree of quality and is computed 
using equation 3.40: 

TEMP = LAMBA - (MSQR IWP - MIN) . (3.40) 

Recall that the smaller the value of MSQRj^p, the better the value 
of FITjyp. In other words, small values of TEMP indicate a good 
FITiwp value, therefore, these fit values receive high weights. Three 
conditions are used in assigning a weight that to each acceptable 

MSQRjyp: 

1. if TEMP<0. 0, then WT Iwp = 1.0; 

2. if 0.0<TEMP<1.0, then the weight is computed using equation 
3.41: 
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WT IWP - (1.0 - TEMP) 2 ; 



(3.41) 



3. if TEMP>1.0, then WT Iwp = 0.0. 

According to the Split Linear Fit developer, equation 3.41 results in 
"smooth transitions between zero weights and small non-zero weights" 
while other functions tend not to have the desired effect [Ref. 10: p. 
3] . Recall that the Weighting Subroutine is repeated for each family of 
mean squared residual values and that the output is a family of weights 
corresponding to each I. 

D. MATHEMATICAL DETAILS — COMBINING SUBROUTINE 

The objective of the' Combining Subroutine is to compute the smooth 
point values. The following data is transferred from the Weighting 
Subroutine : 

1. N, the number of points to be smoothed; 

2. NTRYS, the number of bandwidths used in the smoothing; 

3. N families of fitted values, FITj-^p; 

4. N families of weights, WTj-^p. 

Before using a FITj^p value in the following computations, it is 
compared to the value -1.0x10 . FITjyp values less than or equal to 

-1.0x10 are marked as unacceptable and not used in the computations 
of the corresponding smooth value. Using each family of WTjyp and 
FITiwp, the Combining Subroutine computes a weighted average of the 
linear fitted values, FITjyp. The first step in computing the I** 1 
smooth point is to use the corresponding family of FITjyp and WTjyp 
values in equations 3.42 and 3.43: 

RSUM = for J= 1.2....N : (3.42) 

v> p 

WSUM = Y l Ti WT iwp< for 1=1,2, ...N ; (3.43) 

W p 

If WSUM>1.0xl0~^, then the I*“ smooth point is produced by equation 
3.44, otherwise SMOOTHj equals -1.0x10^®. 
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(3.44) 



SMOOTH, = 



RSUM 

SUM 



Next, I is incremented by one, and the Combining Subroutine is 
repeated using the new value of I and the corresponding families of 
FITjyp and WT^p values. This procedure continues until I>N. If 
these smoothed point values were plotted, the plot would show too many 
peaks and would appear very jagged. In order to alleviate this 
problem, these smoothed point values are used as input data to the 
Regression Subroutine, and the Split Linear Fit algorithm is executed 
one more time. The output of the second pass does not have as much 
variability as the output from the first pass and is thus more useable 
for data reduction. 



E. SELECTION OF WINDOW SIZE 



The section on selection of span value in the previous chapter 
applies in this chapter to a great degree. Window size and span value 
are equivalent terms, and both affect the smoothing to a great degree. 
Since both the Supersmoother and the Split Linear Fit use local linear 
regression, the relationship between a window size and the degree of 
smoothing can be explained by equation 1.4. The effect on the residual 
values caused by varying the window size in equation 1.4 must be kept 
in mind when selecting a window size, i.e. remember the following: 

1. large window sizes produce a smooth plot; 

2. small window sizes produce a not so smooth plot. 

What equation 1.4 is illustrating is that the degree of smoothness is the 
result of a tradeoff between bias and variance in the resulting smooth 
plot since it is an estimation of a function in the presence of additive 
errors [Ref. 10: p. 1]. A satisfactory trade-off between bias and 

variance is difficult to obtain. Better decisions can be made by looking 
at a plot of the smooth output. Therefore, the user of a smoothing 
program should plot the smooth output and then decide if the results 
satisfy his/her needs and desires, otherwise the smoothing program is 
run again with a different window size. Some people want and need 
very 'smooth 1 and highly biased results while others want results on the 
other extreme, i.e low bias and high variance. 
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The above paragraph explains in general the effects produced by 
using one window size, but the Split Linear Fit smoother can accept 
more than one window size. By using more than one window size the 
desired smooth output may be found in less time than if a single window 
size had been used. The reason for this speed is that the Split Linear 
Fit algorithm has more information about the shape of the raw data, 
than when only one window size is used, i.e. a pseudo- distribution of 
fitted values about each raw point is produced. Then the smoothed 
point is computed using this additional information, i.e. a weighted- 
average of the fitted values. But the user needs to apply the basic 
relationship between window size and degree of smoothness stated in the 
above paragraph in selecting a set of window sizes, i.e. a set of large 1 
windows will produce a smoother effect than a set of * small 1 windows. 

According to McDonald and Owen it is best to use a set of three to 
five consecutive odd window sizes [Ref. 10: pp. 2-4]. A mixture of 
small and large window sizes will result in centrally smooth point values 
with a slight degree of variability. In order to be able to accurately 
trace the curvature of the input data, it is best to do the following: 

1. roughly measure the periodicity of the input data; 

2. use this value as one of the window sizes to be used in the 
smoothing; 

3. select the other window sizes with respect to the value of the 
periodicity . 

For example, if the periodicity of the input data is estimated to be 27, 
then 27 is used as an input window size and the other window sizes 
may be 23, 25, 29, and 31. Or the periodicity value may be either the 
smallest window size or the largest window size while the other window 
sizes are selected with respect to the periodicity. 

The other factor that has great influence on the smooth output 
produced by the Split Linear Fit is the minimum window size, MNWNSZ. 
If this value is too large then the smooth output will not be what is 
expected. It is best to keep the value of MNWNSZ at no more than 
one-half the value of the smallest window. This subject will be 
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discussed in more detail in the evaluation of the Split Linear Fit 
smoother. 
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IV. EVALUATION OF THE ADVANCED SMOOTHING ALGORITHMS 



A. GENERAL 

As stated in Chapter I, a smoothing algorithm may be compared to a 
low-pass filter which is designed to do the following: 

1. filter out "noise" from a data set and; 

2. estimate the underlying functional relationship present in the 
given data set. 

Before proposing a 'good and efficient' smoothing algorithm to an indi- 
vidual, the user must be shown that the 'good and efficient' smoothing 
algorithm is robust. In other words, it is necessary to illustrate that 
the smoothing algorithm performs well with most data sets, whether the 
underlying function is either of the following: 

1. simple like a linear function or a simple trigonometric function, 
or; 

2. complex and is very difficult to define mathematically. 

In this chapter, the input data sets used with the Supersmoother 
and the Split Linear Fit smoothing algorithms are generated from simple 
known functions with Normal (0,1) "noise" added. The GRAFSTAT 
[Ref. 12] functions used to generate the pseudo-random Normal (0,1) 
deviates are given in Section 1 of Appendix C. The output produced 
by these algorithms is evaluated in order to do the following: 

1. observe how well the Normal (0,1) "noise" is filtered by the 
smoother, and; 

2. determine how well the true function is extracted and depicted. 

In this chapter the input data sets have a constant variance of 1. 
In the next chapter the input data set dose not necessarily have a 
constant variance, because it is real, and the unknown underlying 
function is probably too complex to define succinctly. 
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B. METHODOLOGY 



There is no established procedure to follow in testing the efficiency 
and effectiveness of a smoothing technique/algorithm. Since the 
adequacy or usefulness of he smooth output is largely based on the 
user's subjective judgement of the shape of the smooth curve, i.e. a 
plot of the smooth point values, the most effective method is to compare 
the output produced by the new algorithm to the output produced by 
previously validated, widely used, and well liked smoothing algorithms, 
e.g. LOWESS [Ref. 3: pp. 94-104]. The following procedure is used to 
evaluate the Supersmoother and the Split Linear Fit smoothing 
algorithms : 

1. explain and display the data set to be smoothed; 

2. display and explain the smooth results produced by the base- 
line smoothing techniques/algorithms, i.e. Least Squares 
Regression, Equal- Weight Moving Average, Cosine -Weighted 
Moving Average, and LOWESS; 

3. display and examine the smooth results produced by the 
advanced smoothers; 

4. compare these results to the previously discussed results from 
2 above. 

The GRAFSTAT graphics package [Ref. 12] was used to generate all 
of the graphs in this thesis. GRAFSTAT was also used to do the Least 
Squares Regression and the Equal-Weight Moving Average smoothing. 

The Method of Least Squares tries to standardize the curves that 
can be fitted to a data set. The measure of performance that is used 
with this global fitting technique is the sum of squared residuals . The 
Method of Least Squares produces a smooth curve which closely approx- 
imates the given set of data points and which minimizes the sum of 
squared residuals attainable with the chosen global curve. For more 
explicit details see Spiegel [Ref. 13: pp. 258-305]. GRAFSTAT lets the 
user select the type of curve that should be fitted to the given data 
set. The following listed curve fits which use the Method of Least 
Squares were used in this thesis and are available in the GRAFSTAT 
graphics package: 
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1. linear curve fit; 

2. quadratic curve fit; 

3. third degree polynomial curve fit, and; 

4. Spline fit. 

Least Squares Regression with linear fit is a technique of finding the 
linear equation which fits a data set and minimizes the sum of squared 
residuals. Least Squares Regression with quadratic fit does the same, 
but the data is fitted to an second degree polynomial equation, i.e. Y = 
AX + BX + C, where A, B, and C are the estimated coefficients, X is 
the abscissa corresponding to the data being smoothed, and Y is an 
estimate of the data being smoothed. Least Squares Regression with 
third degree polynomial curve fit is also basically the same as previous 
two techniques, but the equation being fitted to the given data has the 
form of Y = AX^' + BX^ + CX + D, where A, B, C, and D are the esti- 
mated coefficients and X and Y are the same as for the quadratic fit. 
For more details about Least Squares Regression with either linear fit, 
quadratic fit, or third degree polynomial fit see Spiegel [Ref. 13: pp. 
258-305]. All of these techniques use global curve fitting, i.e. the 
curve is fitted to the given data as an entity. The Spline fit on the 
other hand uses local curve fitting in order to produce the smoothest 
possible curve with the sum of squared residuals value less than or 
equal to a parameter entered by the user [Ref. 12]. The Spline curve 
fitting technique uses the Least Squares Method with third degree poly- 
nomial within a predetermined neighborhood of the given data. The 
neighborhood size is predetermined by the developers of this 
GRAFSTAT function. The second derivative of the defined cubic equa- 
tion is computed and evaluated using the median of the neighborhood of 
point values. The neighborhood is then shifted to the next point and 
the procedure is repeated. The sum of squared residuals is computed 
and compared to the maximum sum of squared residuals value that the 
user requires. If this value exceeds the users constraint then the 
entire procedure is repeated. In other words, the Spline curve fitting 
technique is a constrained linear programming problem, where the 
constraint is the user's maximum sum of squared residuals value 
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desired, and the objective function is a cubic equation [Ref. 16: pp. 
77-87]. The user of these curve fitting technique should first look at 
a scatterplot of the raw data and then decide which one to use. 
Therefore, if a scatterplot of a data set looks linear, the user should 
attempt to fit a linear curve to the data, otherwise, one of the other 
curve fitting techniques should be used. 

The Equal-Weight Moving Average smoother was briefly discussed in 
Chapter I during the discussion of equations 1.3 and 1.4. The 

Equal-Weight Moving Average GRAFSTAT functions that were used to 
generate the smooth point values are shown in Appendix C. The 

following equation defines the smooth points produced by the 
Equal- Weight Moving Average smoother: 



w’here N is the number of points to be smoothed and K is the neighbor- 
hood size, i.e. the number of points involved in the averaging. Both 
N and K must be positive, non- zero integers, w’ith K being odd. For 
an expansion of the Equal- Weight Moving Average smoothing theory, see 
Anscombe [Ref. 14: pp. 153-159]. 

The Cosine Weighted Moving Average smoother is an extension of 
the Equal-Weight Moving Average smoother. Instead of using the 
inverse of K as the weight for each Y value within the neighborhood, a 
cosine related weight is computed for each of the Y values within the 
neighborhood of size K. (The APL functions used to generate these 
values appear in Appendix C.) These cosine weights are a function of 
the Y values' location within the shifting window/neighborhood of size 
K. The expression defining the smoothed output of the 
Cosine-Weighted Moving Average smoother is: 




(4.1) 




(4.2) 



w'here 
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Anscombe [Ref. 14: p. 450] characterizes this smoother as a "better- 
quality approximation" than other Moving Average smoothers, because 
"not only does the integrand, but also its first derivative vanish at 
both ends of the range of integration" [Ref. 14: pp. 156-157]. 

The last smoother used as a base against which the advanced 

smoothers were tested is the Locally Weighted Regression Scatter Plot 
Smoother, commonly called LOWESS [Ref. 3: pp. 94-104]. LOWESS uses 
the smoothing technique referred to as local regression, i.e. the 
Method of Least Squares is used on the input points within a user 

given neighborhood. Only one neighborhood size is used by LOWESS 
per run of the program. The program requires that the user not enter 
the neighborhood size to be used, but a parameter called F, which is a 
ratio of the neighborhood size to the number of points to be smoothed. 
The user has the option of fitting either a linear or a quadratic func- 
tion to the point values within the neighborhood. In addition, the user 
has the option of using robust or non-robust smoothing. Robust 
smoothing has more variability than non-robust smoothing, because 
outliers are emphasized. Each input point within the neighborhood 
receives a weight which is a function of the point's location with 

respect to the median of the neighborhood. These weighted point 
values are then used to define a fitted curve within the neighborhood 
of input point values. The coefficients of the defined curve and the 

median of the neighborhood are used to compute the smoothed point 
value corresponding to the median of the neighborhood. The neighbor- 
hood size is shifted from one point to the next until each input point 
has a corresponding smoothed point value. 

Each smoother being used in this thesis requires that a neighbor- 
hood be indicated by the user, but each of these smoothers calls the 
neighborhood size by a different name as discussed in this section. 
The term 'neighborhood size 1 , i.e. the number of point values involved 
in the averaging, can be used by any of the smoothers being discussed 
in this thesis. The Moving Average smoothers use the variable M to 
indicate the neighborhood size. The LOWESS smoother uses the 
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variable F, as discussed the above paragraph. Supersmoother uses a 
value called SPAN which is equivalent to the F value of LOWESS . The 
Split Linear Fit smoother uses the term 'window size' which is equiva- 
lent to the M value of the Moving Average smoothers. 

C. TESTING AND RESULTS- — -LINEAR UNDERLYING FUNCTION 

The first test posed on the Supersmoother and the Split Linear Fit 
algorithms was to detect linear trends in a data set which does have a 
linear trend. Figure 4.1 shows Test Set One which consists of 200 data 
points produced from the following equation: 

Y = X + Normal (0,1) noise, 0<X*$200 . (4.3) 

The values produced by this function are in tabular form in Appendix 

D. Figure 4.2 shows the results from doing a linear regression on Test 

Set One. It is obvious that the linear regression curve and the true 
linear curve do not coincide. A Confidence Interval Test on the coeffi- 
cients produced by the linear regression reveals that the Y-intercept 
coefficient, 0.0023573 is not significantly different from zero with a 

Confidence Level greater than 0.8. The slope coefficient has a 

Confidence Level less than 0.001 that it is not significantly different 
from zero. Therefore, the linear regression curve can be reduced to 

Y = 1.0104X which has a standard deviation of 0.031 which includes the 
true linear relationship, Y = X. 

The LOWESS smoothing results are shown in Figure 4.3. Since Test 
Set One appears to be linear, the linear option of LOWESS is used. 
There is little visible difference between the results produced by using 
the robust option and the results produced by using the non-robust 
option, i.e. compare the left-hand smooth plots with the right-hand 
smooth plots of Figure 4.3. This is not surprising since there are no 

outliers in this artificial data. The graphs in Figure 4.3 show that as 

the F value is increased, the curve produced gets smoother, i.e. as the 
neighborhood size increases the curve gets smoother because the bias 
increases and the variance decreases. All F values greater than or 
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Figure 4.1 Test Set One. 
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Figure 4.2 Test Set One: Linear Regression. 



equal to 0.5 returned the same smooth point values and thus have the 
same plot. 



65 



o 100 200 0 100 200 



LOWESS (UN FIT) WITH F « .05 



LOWESS (UN FIT) WITH F = .05 




- I i i J 

50 100 150 200 

X 




LOWESS (UN FIT) WITH F - .3 



LOWESS (UN FIT) WITH F - .3 





LOWESS (UN FIT) WITH F - .5 LOWESS (UN FIT) WITH F * .5 





Figure 4.3 Test Set One: LOWESS Smoothing. 
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Therefore, an individual who has little knowledge of the LOWES S 
theory should get quick, reasonable results if the F value used is 
between 0.33 and 0.66 as suggested by Chambers, etal [Ref. 3: p. 98]. 
The LOWES S smoothing program may have to be run several times 
before a straight line is produced. 

In Figure 4.4 are shown the smoothing results produced by the 
Supersmoother algorithm. The graphs on the left-hand side display the 
curves produced when only one span value is used. The graphs on the 
right are the results of using three span values during each run of 
Supersmoother. The two top right graphs in Figure 4.4 illustrate the 
difference between robust smoothing, i.e. ALPHA= 0.0, and noil-robust 
smoothing, i.e. ALPHA= 10.0. The single span value curves in Figure 
4.4 are quite similar to the smooth curves produced by LOWESS. The 
reason for the similarity is that both LOWESS and Supersmoother are 
central smoothers, i.e. the smooth point value is the result of aver- 
aging over the points in the neighborhood. 

When three span values are used the smooth points generated by 
Supersmoother converge much faster to the underlying linear function 
than the smooth points generated by LOWESS. Therefore, the 
Supersmoother algorithm traces very well the linearity of a data set 
with linear trends . 

When the Split Linear Fit algorithm is used with only one window 
size, the resulting curves, shown on the left-hand side of Figure 4.5, 
are not much different from the curves produced by LOWESS and 
Supersmoother. The right-hand graphs of Figure 4.5 illustrate that 
when the Split Linear Fit algorithm is given more than one window size, 
the generated smooth point values do not converge to the linear under- 
lying function as fast as Supersmoother . The smallest window size, 

i.e. 10 and 15, in each case has a great impact on the shape of the 
smooth curve, because with few points in the window the outliers 
receive higher weights than in ! large f windows. This illustrates that 
this smoothing algorithm is designed to place more emphasis on the 
outlying data points in the data. Equation 1.4 explains that smaller 
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SUPERSMOOTH WITH SPAN =* .05 



SUPERSMOOTH WITH SPAN =* .05, .3, .5 





SUPERSMOOTH WITH SPAN - .3 SUPERSMOOTH WITH SPANS « .05, .3, .5 





SUPERSMOOTH WITH SPAN * .5 



SUPERSMOOTH WITH SPANS » .3. .5, .8 





Figure 4.4 



Test Set One 



Smoothing With Supersmoother 



window sizes will have less bias and thus show more shape, which is 
the case shown in the right-hand plots of Figure 4.5. Therefore, the 
user's selection of the smallest window size will determine the degree of 
convergence toward the linear underlying function. 

Figure 4.6 exhibits smoother, more linear curves than were shown 
in Figure 4.5, the reason being that larger window sizes are used, i.e. 
increase of bias and decrease of the variance. The bottom graphs of 
Figure 4.6 demonstrate the effect of increasing the minimum window 
size, MNWNSZ. In Chapter II of this thesis the use of MNWNSZ is 
discussed. The author of this thesis, after using the Split Linear Fit 
for several months, recommends that the size of MNWNSZ be less than 
one-half the size of the smallest window size being used. The bottom 
left graph of Figure 4.6 illustrates the distorting effect produced when 
this recommendation is not followed. The Split Linear Fit algorithm 
does a good job of depicting data with a linear trend, but the user 
needs to understand the theory behind the Split Linear Fit, e.g. the 
window sizes had to be increased in order to produce a smoother curve, 
in order to achieve acceptable results. 

Another measure of performance that can be used in verifying the 
efficiency of a smoother is the sum of squared residuals. Table 1 
shows the sum of squared residuals for the 'best' fitting linear curves 
produced by each smoother, where 'best' is supported by a plot of the 
smooth curves shown in this chapter. 

All the fits listed in Table 1 produce a fairly straight line which is 
close to the true underlying function, Y = X. Other fitted curves do 
produce lower values of sum of squared residuals but the plotted curve 
deviates from a straight line, e.g. the plot produced by Supersmoother 
with SPAN(s) = 0.05, 0.3, 0.5 and ALPHA = 0.0 is not very straight, 
but the sum of squared residuals is 204.6112056. The decrease in the 
sum of squared residuals is due to an increase of bias. Supersmoother 
performed almost as well as the Linear Regression and LOWESS, but 
three neighborhood sizes had to be used, instead of one. The Split 
Linear Fit smoother did not do as well as Supersmoother for the 
following reasons: 
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SPLIT UNEAR FIT. WNSZ * 10 



SPUT UNEAR RT, WNSZ(S)»10. 60. 100, 160. 200 





SPLIT UNEAR FTT, WNSZ =■ 60 SPUT UNEAR FTT, WNSZ(S)- 10, 12, U, 16 
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Figure 4.5 Test Set One: Smoothing With Split Linear Fit. 
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SPLIT LINEAR FIT, WNSZ(S)- 25, 50, 100, 150 SPLIT LINEAR FIT, WNSZ(S)- 160. 170. 180, 190 





SPLIT LINEAR FIT, WNSZ(S)- 25, 50, 100, 125 SPLIT UNEAR FIT, WNSZ(S)» 160, 170, 180, 190 





Figure 4.6 Test Set One: Split Linear Fit 

Change of Smallest Window Size and MNWNSZ. 

1. Neighborhood sizes larger than those used by the other 
smoothers had to be used by Split Linear Fit before a smooth 
curve could be produced, therefore, this smoother is slower 
than the other smoothers in converging to a known underlying 
function. 

2. The sum of squared residuals value produced by the best, 
Split Linear Fit curve is the largest of all the values, there- 
fore, this curve is not as accurate as the other 'best' curves. 
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TABLE 1 
TEST SET ONE: 

SUM OF SQUARED RESIDUALS OF THE BEST FITS 



Type of Fit 

Linear Regression 
LOWESS, robust. F= 0.5 
LOWESS, non-robust, F= 0.5 
Supersmoother, ALPHA= 10.0 
SPAN(s) = 6.05. 0.3, 0.5 
Split Linear Fit. MnWN$Z= 50 
WNSZ(s)= l6o, 170, 180, 190 



Sum of Squared 
Residuals 

205 . 94074 
205.92350 
205.82359 
205.84026 

206.5657 



TABLE 2 

TEST SET ONE: COMPUTER CPU CONSUMED 



Type of Fit 

Linear Regression 
LOWESS, robust, F= 0.5 
LOWESS, non-robust, F= 0.5 
Supersmoother, ALPHA= 10.0 
SPAN(s)= 6.05, 0.3, 0.5 
Split Linear Fit, MnWN$Z= 50 
WNSZ(s)= 1 (j0, 170, 180, 190 



CPU Consumed 
(in Seconds) 



9.73 

3.36 

1.55 

2.28 



In Table 2 are listed the Central Processing Units, i.e. unit of time 
used by the IBM 3033 computer in processing a program, consumed 
when the smoothing techniques listed in Table 1 are used. In order to 
be consistent in the CPU measurements, each smoother was used to 
smooth the same data set and place the smoothed output in an APL 
variable. The CPU times listed in the table indicate that the advanced 
smoothers do better than most of the other smoothers, but the improve- 
ment is only in seconds. Therefore, a user trying to select between 
the smoothers should balance this saving in CPU time with the cost of 
computer and personnel time, before deriving a conclusion. 
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D. TESTING AND RESULTS SMOOTH CURVATURE IN UNDERLYING 

FUNCTION 



The second test is designed to test how the Supersmoother and the 
Split Linear Fit algorithms perform on a data set which has an under- 
lying function with smooth curvature, i.e. the change from one point to 
the next is not abrupt. Figure 4.7 displays Test Set Two which 
consists of 200 data points generated with the following equation: 



Y = 




Norma] (0,1) noise, 0<XsJ200 . 



(4.4) 



The values generated by this function and used in this section are in 
tabular form in Appendix D. 




Figure 4.7 Test Set Two. 



Figure 4.8 shows the curve fitting results of third degree polyno- 
mial curve fit to Test Set Two. Confidence Interval Tests on the coef- 
ficients of the equation shown on Figure 4.8 reveal that the coefficients 
are not significantly different from zero with a Confidence Level of less 
than 0.001, thus the coefficients should be accepted. 
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TEST SET TWO: 3 POLYNOMIAL CURVE FIT 




Figure 4.8 Test Set Two: Third Degree Polynomial Curve Fit. 

Figure 4.9 shows the results of the Spline fit, as previously 

discussed this GRAFSTAT function requires that the user enter the 

maximum sum of squared residuals value. In Figure 4.9 0 indicates 
that the smooth function should have a sum of squared residuals less 
than or approximately equal to the second parameter. 

Figure 4.10 shows the smooth curves produced from using the 

Equal-Weight Moving Average smoother. As usual a larger neighbor- 
hood size, M, results in a smoother curve. As stated in Chapter I, the 
disadvantage of using the Equal-Weight Moving Average smoother is that 
smooth data points are dropped from the ends of the output data set. 
This is illustrated in Figure 4.10 where with M=60, 30 points are 

dropped from each end. 

Figure 4.11 shows the results produced by the Cosine- Weighted 

Moving Average smoother. Since this smoother is an extension of the 
Equal-Weight Moving Average, it can be seen that as the neighborhood 
size, M, is increased smooth point values are also dropped from the 

ends of the data set. In addition, this figure illustrates how the 

smooth curve converges toward the true underlying function and then 



74 



TEST SET TWO: SPLINE FTT WITH PARAMETERS (0.200) 



TEST SET TWO: SPLINE FIT WITH PARAMETERS (0.203) 





Figure 4.9 Test Set Two: Spline Fit. 



EOUAL-WEIGHT MOVING AVERAGE. M- 10 EQUAL-WEIGHT MOVING AVERAGE. M» 60 





Figure 4.10 Test Set Two: 

Equal- Weight Moving Average Smoothing. 



away from this same underlying function, i.e. the variance decreases 
but bias increases beyond a certain point. 
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COSINE WTD MOV AVG WITH M- 



COSINE WTO MOV AVG WITH M- 61 




COSINE WTD MOV AVG WITH M- 101 




COSINE WTO MOV AVG WITH M- 161 





. Figure 4.11 Test Set Two: 

Cosine- Weighted Moving Average Smoothing. 

The LOWESS smooth results are shown in Figure 4.12. Since Test 
Set Two appears to be non-linear, the quadratic fitting option of 
LOWESS is used. Only the robust cases are shown since the results 
are basically the same as the non-robust cases. The best fitting curve 
appears to be the smooth curve produced by using F = 0.5. 

The Supersmoother results are displayed in Figure 4.13. The 
smooth curves produced by using three span values tend to maintain 
the shape of the underlying function across most span values. 
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Figure 4.12 Test Set Two: LOWESS Smoothing. 



However, if only one span value is used, the shape of the smooth curve 
approaches a straight line as the size of the span value approaches the 
value of 1.0. The top right and the middle right graphs illustrate the 
difference between using robust smoothing, i.e. ALPHA= 0.0, and 
non-robust smoothing, i.e. ALPHA= 10.0. Therefore, it is best to use 
Supersmoother with three span values and ALPHA=0.0. 

Figure 4.14 displays a radical difference, between using a single 
window size and several window sizes as input to the Split Linear Fit 
algorithm. As stated in Chapter III, the main purpose of the Split 
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SUPERSMOOTH WITH SPAN - 05 



SUPERSMOOTH WITH SPANS » .05. 3. .5 





SUPERSMOOTH WITH SPAN =■ .3 



SUPERSMOOTH WITH SPANS =» .05. .3. .5 





SUPERSMOOTH WITH SPAN = .5 



SUPERSMOOTH WITH SPANS =» 3. .5. .3 





Figure 4.13 



Test Set Two 



Smoothing With Supersmoother 



SPLIT UNEAR FIT, WNSZ » 10 



SPUT UNEAR FIT, WNSZ(S) -10, 60. 100, 160, 200 





SPLIT UNEAR FIT, WNSZ - 60 




SPUT UNEAR FIT, WNSZ(S)- 10. 12. 14, 16 




SPUT UN EAR FIT, WNSZ - 1 00 



SPUT UNEAR FIT. WNSZ(S)» 15. 17. 19. 21 





Figure 4.14 



Test Set Two 



Smoothing With Split Linear Fit 



SPLIT UNEAR FIT. WNSZ(S)- 25. 50, 100, 150 SPUT LINEAR FIT, WNSZ(S)=- 160, 170, 180. 190 





SPUT UNEAR FIT, WNSZ(S) 25, 50. 100, 125 



SPUT UNEAR FIT, WNSZ(S)« 160, 170, 180, 190 




X 




Figure 4.15 Test Set Two: Split Linear Fit 
Change of MNWNSZ. 



Linear Fit smoother is to disclose sudden abrupt changes in a data set. 
By using a single window size, the algorithm does not produce enough 
information about the true shape of the given data set, and the result 
is the deviating smooth curves on the left side of Figure 4.14. The 
use of several window sizes produces more information about the raw 
data, and the result is smoother curves; see the graphs on the right of 
Figure 4.14. Figure 4.15 shows the effect produced when the size of 
MNWNSZ is changed and when large window are used. The smoothed 
output produced by the Split Linear Fit smoother never converges to 
the true underlying function no matter what window sizes are used. 
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The sum of squared residuals corresponding to the 'best' fitting 
curves are listed in Table 3. Each of the fits listed in Table 3 has a 
different degree of convergence toward the true underlying function, 

Y = COS(X/25). Other fits not listed in Table 3 but discussed in this 
section produced smaller sum of squared residuals, however the smooth 
curve did not resemble the true underlying function to a high degree. 
Supersmoother performs fairly well when given a set of data with a 
smooth curvature. The Split Linear Fit smoother did a poor tracing the 
true underlying curve very well, when compared to other more simple 
smoothers. 



TABLE 3 

TEST SET TWO: 

SUM OF SQUARED RESIDUALS OF THE BEST FITS 



Type of Fit 



Sum of Squared 
Residuals 



Third Degree Polynomial Curve Fit 
Spline Fit (0, 20o) 

Equal- Weight Moving Average, M= 60 
Cosine -Weighted Moving Average, M= 61 
LOWES S, robust. F= 0 5 
LOWESS, non-robust, F= 0.5 
Supersmoother, ALPHA= 0.0 
SPAN(s)= (3.05. 0.3, 0.5 
Split Linear Fit, lviNWN^Z= 5 
WNSZ(s)= 23, 50, 100, 150 



219.14846 

203.70213 

159.04903 

158.38194 

207.26 

207.133 

209.28674 

209.51462 



Table 4 shows the CPU times consumed by the smoothers listed in 
Table 3. The Cosine-Weighted Moving Average smoother, in addition to 
producing a very good sum of squared residuals value, is very fast in 
generating the smoothed results. The advanced smoothers were much 
slower than the Cosine -Weighted Moving Average, but were faster than 
LOWESS. 
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TABLE 4 

TEST SET TWO: COMPUTER CPU CONSUMED 



CPU Consumed 



Type of Fit (in Seconds) 

Third Degree Polynomial Curve Fit 0.96 

Spline Fit (0, 203) 12.15 

Equal-Weight Moving Average, M= 60 0.06 

Cosine-Weighted Moving Average, M= 61 0.22 

LOWESS, robust. F= 0.5 21.64 

LOWESS, non-robust, F= 0.5 7.13 

Supersmoother, ALPHA= 10.0 1.55 

SPAN(s)= (3.05, 0.3, 0.5 

Split Linear Fit, NfNWN!iiZ= 50 2.37 

WNSZ(s)= l£0, 170, 180, 190 



E. TESTING AND RESULTS ABRUPT CHANGES IN CURVATURE IN 

UNDERLYING FUNCTION 



The third test examines the performance of the Supersmoother and 
the Split Linear Fit algorithms on a data set which includes a triangular 
function. Test Set Three is shown in Figure 4.16 and the point values 
are displayed in table form in Appendix D. The following equation was 
used to generate Test Set Three: 



1.0 -r 0.06X 

8.0 - 0.08X 



Y = 1 



-0.808 + 0.008X 



0.392 



, if 0<X^50 
, if 50<Xsa00 
, if 100<X^150 
, if 150<XsS200 . 



(4.5) 



In order to check the data set against equation 4.5 a linear regres- 
sion was done on each part of the above equation and the results are 
displayed in Figure 4.17. This figure shows that the regression equa- 
tions deviate from the true equations. A Confidence Interval test done 
on each of the coefficients indicates that the coefficients are signifi- 
cantly different from zero, therefore, the equations produced by the 
regression are accepted. 
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The curve resulting from third degree polynomial fit on Test Set 
Three is shown in Figure 4.18. A Confidence Interval test done on the 
coefficients reveals that the coefficients produced by this fitting tech- 
nique are not significantly different from zero with a Confidence Level 
less than 0.001. Thus all the coefficients in the equation shown in 
Figure 4.18 are accepted. 




Figure 4.16 Test Set Three. 



A plot of the point values generated by a Spline Fit with the sum of 
squared residuals required to be no greater than 204 is shown in 
Figure 4.19. 

The smooth point values generated on two runs of the Equal-Weight 
Moving Average are plotted, and the curves are displayed in Figure 
4.20. 

The curves produced by plotting the smooth point values generated 
by the Cosine-Weighted Moving Average smoother are displayed in 
Figure 4.21. 
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FIRST FIFTY POINTS OF TEST SET THREE 




THIRO FIFTY POINTS OF TEST SET THREE 




SECOND FIFTY POINTS OF TEST SET THREE 




FOURTH FIFTY POINTS OF TEST SET THREE 
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Figure 4.17 Test Set Three Broken Into Four Linear Sections. 

The LOWESS smoothing results are displayed in Figure 4.22. The 
best fit is produced by the run with F= 0.3. The smooth curve almost 
coincides with the true underlying function. 

Figure 4 . 23 contains the plots of the smooth points produced by 
Supersmoother. As can be seen, the best fit is when three span values 
are used in the smoothing, i.e. SPAN(s)= 0.05, 0.3, 0.5, which happen 
to be starting values recommended by the Friedman and Stuetzle 
[Ref. 9: p. 9]. The smooth curve tends to depict the true underlying 
function very well. The top right graph and the middle right graph 
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TEST SET THREE: 3 POLYNOMIAL CURVE FIT 




Figure 4.18 Test Set Three: 
Third Degree Polynomial Curve Fit. 



TEST SET THREE: SPLINE FIT WITH PARAMETERS (0.204) 




Figure 4.19 Test Set Three: Spline Fit. 

show the great disparity between using non-robust smoothing, i.e. 
ALPHA= 10.0, and robust smoothing, i.e ALPHA = 0.0. Thus with the 
Supersmoother three span values with a ALPHA=0.0 are necessary to 
get useful results. 
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Figure 4.20 Test Set Three: 

Equal- Weight Moving Average Smoothing. 

By examining the plots in Figure 4.24 and 4.25, it is obvious that 
the Split Linear Fit smoother produces results which are often too 
erratic and not useful. Outlying points have too much influence on the 
output produced by this smoother. The only plot without any drastic 
deviations from the curvature of the underlying function is the top left 
plot in Figure 4.25. 

Table 5 shows for the third time that in addition to not producing 
good smooth curves which depict the underlying function very well, the 
advanced smoothers do not produce sum of squared residuals values as 
good as the baseline smoothers . 

Table 6 shows that the advanced are consistently using the same low 
amount of CPU time. The LOWESS has fluctuated in CPU usage, but 
has always used the most CPU. The Cosine-Weighted Moving Average 
smoother for the second time has generated a very good sum of squared 
residuals value and has used the least amount of CPU. 
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COSINE WTO MOV AVG WITH M- 1 1 



COSINE WTO MOV AVG WITH M- 61 





COSINE WTO MOV AVG WITH M- 101 COSINE WTO MOV AVG WITH M- 161 





Figure 4.21 Test Set Three: 
Cosine- Weighted Moving Average. 



F. CONCLUSIONS 

The advanced smoothers, Supersmoother and Split Linear Fit, are 
quite complex to thoroughly understand and require that the user enter 
many parameters . The interrelationship between the parameters is not 
clear, and the results are difficult to control. For example, the 
smallest neighborhood size used by the advanced smoothers has more 
influence on the output than the other neighborhood sizes, but this 
value can not be too big or the output will get distorted. In addition, 
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LOWESS (QUAD FIT) WITH F = OS 



LOWESS (QUAD FIT) WITH F = .3 





LOWESS (QUAO FIT) WITH F « .S 



LOWESS (QUAD FIT) WITH F - .8 





Figure 4.22 Test Set Three: LOWESS Smoothing. 

to the three or more neighborhood sizes that the user must think about 
and calculate, a fourth value must be considered, i.e. the 
Supersmoother's ALPHA value and the Split Linear Fit's MNWNSZ value. 
Changing anyone of these values in the advanced smoothers produces 
radical changes in the shape of the fitted curve, leaving the user 
confused as to which values to change and by how much. Friedman 
and Stuetzle recommend that the neighborhood sizes be between 5 and 
50 percent of N, where N is the number of points to be smoothed. They 
also claim that "savings are substantial" [Ref. 9: p. 5], i.e. in the 
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SUPERSMOOTH WITH SPAN - .05 



SUPERSMOOTH WITH SPAN =» .05, .3, .5 





SUPERSMOOTH WITH SPAN - .3 SUPERSMOOTH WITH SPAN - .05, .3, .5 





SUPERSMOOTH WfTH SPAN - .5 SUPERSMOOTH WITH SPAN(S) - .3, .5. .3 





Figure 4.23 Test Set Three: 
Smoothing With Supersmoother. 
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SPLIT UNEAR FIT, WNSZ - 10 



SPUT UNEAR FIT, WNSZ(S)- 10. 12. 14, 16 





SPUT UNEAR FIT, WNSZ - 60 



SPUT UNEAR FIT, WNSZ(S)- 15, 17, 19. 21 





SPUT UNEAR FIT, WNSZ - 100 



SPUT UNEAR FIT, WNSZ(S)-10. 60. 100, 160, 200 





Figure 4.24 Test Set Three: 
Smoothing With Split Linear Fit. 
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SPUT UNEAR RT, WNSZ(S)=* 25, 50. 100, 150 




SPUT LINEAR RT, WNSZ(S)=* 160, 170, 180, 190 
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Figure 4.25 Test Set Three: Split Linear Fit, 

Change of MNWNSZ. 



time required to find a desired smoothed curve is greatly reduced. 
The SPAN values used in this thesis meet this criteria. The program 
runs fast, but the sum of squared residuals values produced are not as 
good as the values produced by the simpler, more user friendly 
smoothers, one of which uses far less CPU. 

McDonald and Owen never really give any guidance on the number 
of window sizes to use except that they used "several (typically three 
to five)" [Ref. 10: p. 2] in their testing of the Split Linear Fit 
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TABLE 5 

TEST SET THREE: 

SUM OF SQUARED RESIDUALS OF THE BEST FITS 



Type of Fit 

Third Degree Polynomial Curve Fit 
Spline FiF (0, 204) 

Equal-Weight Moving Average, M= 60 
Cosine-Weighted Moving Average, M= 61 
LOWESS, robust. F= 0.3 
LOWESS, non-robust, F= 0.3 
Supersmoother, ALPHA= 0.0 
SPAN(s)= 6.05, 0.3, 0.5 
Split Linear Fit, NfNWN!bZ= 5 
WNSZ(s)= 26, 50, 100, 150 



Sum of Squared 
Residuals 

270.05338 

204.66025 

173.46090 

156.80284 

204.173 

210.571 

204.70795 

206.59216 



TABLE 6 

TEST SET THREE: COMPUTER CPU CONSUMED 

CPU Consumed 



Type of Fit (in Seconds) 

Third Degree Polynomial Curve Fit 0.98 

Spline Fif (0, 203) 12.95 

Equal-Weight Moving Average, M= 60 0.07 

Cosine-Weighted Moving Average, M= 61 0.22 

LOWESS, robust. F= 0.5 15.19 

LOWESS, non-robust, F= 0.5 5.02 

Supersmoother, ALPHA= 10.0 1.53 

SPAN(s)= 6.05, 0 . 3 0.5 

Split Linear Fit, MnWN$Z= 50 2.37 

WNSZ(s)= l60, 170, 180, 190 



smoother. The smooth curves produced by this smoother never are 
consistent, i.e. follow a pattern which can be used as a guide toward 
the desired smooth curve. Figure 4.24 is a good example of this 
problem. The smooth curve is totally different from one graph to the 
next. Results may be obtained fast, but a user wants good results. 
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In conclusion, even though the advanced smoothers are fast, they 
do not perform as well as the LOWESS smoother or the faster 
Cosine-Weighted Moving Average smoother in depicting simple functional 
relationships. The consistently good sum of squared residuals values 
and smooth curves produced by these simple smoothers favor their use 
over the advanced smoothers. 
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V. EVALUATION/ APPLI C A TI ON ADV ANCE D SMOO TH I N G 

ALGORITHMS 



A. GENERAL 

The evaluation of Supersmoother and Split Linear Fit continues in 
this chapter. As stated in the previous chapter, a smoothing algorithm 
is used to extract the underlying relationship from a data set. 
Smoothing is especially useful if the underlying relationship is complex, 
i.e. too difficult to describe mathematically or use simple (global) least 
squares regression. The small and simple data sets in the previous 
chapter are easy to smooth since the shape of the underlying function 
is quite visible in a scatterplot of the raw data, see Figures 4.1, 4.7, 
and 4.16. Least squares regression is thus easy to apply to these data 
sets. On the other hand, the least squares method does not adequately 
smooth the data set tabulated in table form in Appendix D. A plot of 
this data is shown in Figure 1.1. The great amount of variability 
inherent in the data set causes the regression technique to be inade- 
quate, i.e. the raw data is very erratic. Most data sets collected from 
real populations/situations do not have a constant nor smooth variance, 
thus regression techniques fail to be adequate, i.e. the fitted curve 
is too smooth and the sum of squared residuals is too high. 

In this chapter the data set utilized is the daily sea-surface temper- 
atures at Granite Canyon, just south of Point Sur, California [Ref. 10]. 
The data displayed in Figure 1.1 is the first of thirteen years of sea- 
surface temperature data collected at this location. This data set defi- 
nitely does not have a constant variance, but it seems to exhibit some 
periodicity and to have some points of discontinuity, notably a very 
sudden and strong drop in temperature because of current up wellings 
in the spring. This data set was selected for final evaluation of 
Supersmoother and Split Linear Sit because of the following reasons: 

1. this data set has been a subject of intense data analysis 

[Ref. 4]; 
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2. the characteristics exhibited by this data set, mentioned above, 
and; 

3. it was easily accessible. 

B. METHODOLOGY 

The procedure followed in this chapter in the evaluation of 
Supersmoother and Split Linear Fit is described in the previous chapter 
in the Methodology section. The procedure is outlined below: 

1. display and examine the data set to be smoothed; 

2. display and examine the smooth results produced by the base- 
line smoothing techniques, i.e. the Least Squares Regression, 
the Equal-Weight Moving Average, the Cosine-Weighted Moving 
Average, and LOWESS; 

3. display and examine the smooth results produced by the 
advanced smoothers; 

4. compare these results to the results from 2 above. 

In the previous chapter the graphs showing the smooth points produced 
by any one smoothing technique were displayed together, e.g. see 
Figure 4.3 which has all the Test Set One LOWESS smoothing results. 
In this chapter it is best to display together the smoothing results that 
use equivalent neighborhood sizes, as described below: 

1. the neighborhood size in the Moving Average smoothing tech- 
nique is called M, which is equivalent to window size used in 
Split Linear Fit; 

2. the span value used by Supersmoother is equivalent to the F 
value used by LOWESS, both are computed as the ratio of the 
neighborhood size to the number of data points to be smoothed. 

This change in plot display makes it easier to subjectively decide 
the adequacy or usefulness of the advanced smoothing algorithms. The 
word 'subjectively' is used as the measure of effectiveness because, as 
mentioned before, the decision to use one smooth curve over another is 
basically based on the user's needs and desires and on the curve's 
appearance. In order to bare the comparison a more concrete statistical 
analysis, the sum of squared residuals of the different plots will be 
calculated. 
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The complete sea-surface temperature data set contains 4380 data 
points, i.e. a sea-surface temperature corresponding to each day for 
the period of March 1, 1971 to February 1983, excluding the dates of 
February 29. Only the first 671 data points which correspond to 1971 
and 1972 will be used in the evaluation in this chapter so as to have a 
manageable input data set. 

Figure 5.1 shows a scatterplot of the data set used in this chapter. 
The axis scales shown in this figure will be the standard axis scales to 
be used in all the graphs corresponding to this chapter. The vertical 
axis which is easiest to describe displays the sea- surface temperature 
range in degrees centigrade. The horizontal axis displays the day that 
the temperature was measured. The numbers shown indicate the 
Calendar date, i.e. the first digit indicates the year, assuming that the 
corresponding decade is known. Recall that the abbreviated data set 
used in this chapter was collected in the 1970's. The next three digits 
indicate the day of the year, for example '080' means the 80^ day of 
the year. The vertical dashed grid lines indicate the change in season 
during the year, e.g. at 1080 winter 1971 ends and spring 1971 begins. 
The solid tic marks indicate the end of a month. All the graphs begin 
with January. This is the reason that in Figure 5.1 the first two bins, 
January and February, are empty since the data set begins with March 
1, 1971. Finally, since the abscissa are measured in days, the neigh- 
borhood sizes used in this chapter will also be in days, e.g. a neigh- 
borhood size of 5 will correspond to a period of 5 days. 

C. TESTING AND RESULTS 

Figure 5.1 exhibits the data set that is evaluated in this chapter. 
From this graph alone, it could be deduced that there are two tempera- 
ture cycles present in the data set. The first cycle peaks near the 
end of summer 1971, and second cycle peaks just after the beginning of 
fall 1972. Another point of view that the analyst may take by looking 
at Figure 5.1 is to analyze the point distribution between the 
peaks, e.g. the data points from beginning of fall 1971 to beginning of 
fall 1972. Within this period the temperature appears to follow a cyclic 
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process, but with a smaller period, e.g. from 1355 to 2080 the temper- 
ature increased and then decreased and from 2080 to 2172 the opposite 
is reflected. Therefore, one analyst may try to depict the intra-annual 
cycle, while another may try to display the inter -annual cycle, while 
still another may try to show both cycles or something else. The anal- 
ysis in this chapter will concentrate on the intra-annual cycle as 
depicted by the advanced smoothers and compare these results to 
curves produced by previously validated and well-accepted smoothing 
techniques. 



OAILY SEA SURFACE TEMPERATURE AT GRANITE CANTON 
MARCH 1. 1971 TO 0ECEM8ER 31. 1972 




col :i ,i ,i ;i i i ii j i .i i . i . i-i. ii — i,. > ii ; ,i u i 

1030 1172 126 * 1333 2030 2172 226 * 2333 

JULIAN CALENDAR DATE 



Figure 5.1 Data Set For Practical Application. 



Figure 5.2 shows the data divided into the two annual periods in 
order to display the intra-annual characteristics of the data. With 
these displays the intra-annual variability is more detectable and a 
different view of smoothing the data can be taken, i.e. the data can be 
smoothed to show the cyclic effect within the seasons, for example 
between 1172 and 1264. 
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DAILY SEA-SURFACE TEMPERATURE AT GRANITE CANYON. CA DAILY SEA-SURFACE TEMPERATURE AT GRANITE CANYON. CA 
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Figure 5.2 Data Set For Practical Application' 

Divided Into Two Parts. 

/ 

Figures 5.3 through 5.8 show the smooth curves produced by the 
smoothers being compared in this chapter. A different neighborhood 
size was used to smooth the data in each case. One of the three span 
values used in the Supersmoother corresponds to the neighborhood size 
that is used by the other smoothers within a figure; the same applies to 
the Split Linear Fit. In order to properly evaluate the adequacy of the 
smoothing produced by the advanced smoothers, it is best to compare 
similar smooth curves produced by both the advanced smoothers and the 
baseline smoothers. 

As the neighborhood size increases, the produced smooth curve gets 
smoother. This effect is described by the analytical equation 1.4. In 
addition, as the neighborhood size increases, the Moving Average type 
smoothers loose more smooth data points from each end; this effect was 
described in Chapter I. 
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After each figure is a table which compares the sum of squared 
residuals corresponding to the curves displayed in the figure. This 
gives the analyst a better and more statistically supported comparison 
between the smooth output produced by the different smoothers. In 
addition, a table showing the CPU time used by each of the smoothers 
follows the sum of squared residuals table. 

Figure 5.3 presents the smooth curves which are produced when the 
neighborhood size is small,i.e. neighborhood size of 5. The curves are 
very erratic and somewhat unpleasant to the eye. Because some of the 
peaks are too close to each other, the short term cyclic effect is over- 
emphasized and rendered useless. In Table 7 it can be seen that the 
sum of squared residuals that correspond to the advanced smoothers are 
not as low as those corresponding to the simple smoothers. In fact the 
sum of squared residuals corresponding to the Supersmoother is almost 
three times that produced by LOWES S, w’hich is the lowest value. 
Table 8 shows that the Cosine-Weighted Moving Average smoother is 
about three times faster than Supersmoother and about 5 times faster 
than Split Linear Fit, yet more accurate. 

In Figure 5.4 the smooth plots are not as jagged as the ones shown 
in Figure 5.3. The reason for this difference is that the neighborhood 
size used to produce the smooth curves in Figure 5.4 is twice that of 
those used to produce the smooth curves in Figure 5.3. There are 
some slight differences between each smooth curve in Figure 5.4, but 
the differences are difficult to detect. The curve produced by 
Supersmoother seems to have the least amount of jagged peaks thus 
making it easier to count the increasing and decreasing cycles within 
each season. However, Table 9 shows that the sum of squared resi- 
duals produced by Supersmoother is not as good as those produced by 
the LOWESS, either robust or non-robust smooth curve. The analyst in 
this case has the choice of deciding whether to have a good smooth 
curve with a high sum of squared residuals or a not so smooth curve 
with a low sum of squared residuals. Table 10 shows that the 

Cosine-Weighted Moving Average smoother is much faster than either of 
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TEMPERATURE IN DEGREE CENT. TEMPERATURE IN DECREE CENT TEMPERATURE IN DEGREE CENT. 



SPUNE FIT, PARAMETERS (0.100) 



SMOOTHING WITH EOUAL-W0GHT MOVING AVERAGE. M- 5 






SMOOTHING WITH LOWESS. F- 0.00745 
ROBUST. UNEAR FITTING 




SMOOTHING WITH SUPERSMOOTHER. ALPHA- 0.0 
SPAN(S)=- 0.00745. 0.016393. 0.0175 




SMOOTHING WITH SPUT UNEAR FIT 
WNSZ(S)— 5, 11. 13; MNWNSZ=» 2 




Figure 5.3 Comparison of Smoothers 
Using Neighborhood Size of 5. 
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TABLE 7 




SUM OF SQUARED RESIDUALS USING 


NEIGHBORHOOD SIZE OF 


Type of Fit 


Sum of Squared 
Residuals 


Spline (0,100) 


100.47089 


Equal- Weight Moving Average, M= 5 


84 . 9068 


Cosine- Weighted Moving Average, M= 5 


48.0688 


LOWES S, robust, F= 0.00745 


49.1153 


LOWESS, non-robust, F= 0.00745 


37.3108 


Supersmoother, ALPHA= 0.0 

SPAN(s)= 0.00745, 0.016393, 0.0175 


105.7947 


Split Linear Fit, MNWNSZ= 2 
WNSZ(s)= 5, 11, 13 


60.5805 



TABLE 8 




CPU USAGE: NEIGHBORHOOD 


SIZE OF 5 


Type of Fit 


CPU Consumed 
(in Seconds) 


Spline (0,100) 


26.6 


Equal- Weight Moving Average, M= 5 


0.03 


Cosine- Weighted Moving Average, M= 5 


0.69 


LOWESS, robust, F= 0.00745 


16.07 


LOWESS, non-robust, F= 0.00745 


5.41 


Supersmoother , ALPHA= 0.0 

SPAN(s)= 6.00745, 0.016393, 0.0175 


2.27 


Split Linear Fit, MNWNSZ= 2 
WNSZ(s)= 5, 11, 13 


3.79 
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TEMPERATURE IN DEGREE CENT. TEMPERATURE IN DEGREE CENT. ^ TEMPERATURE IN DEGREE CENT. 



SPUNE FIT, PARAMETERS (0,200) 



SMOOTHING WITH EQUAL-WEIGHT MOVING AVERAGE. M- 13 





IMOOTHING WITH COSINE-WEIGHTED MOVING AVERAGE. M- 13 




SMOOTHING WITH LOWESS. F- 0.0175 
ROBUST. OUARORATIC FTTTING 




SMOOTHING WITH SUPERSMOOTHER. ALPHA- 0.0 
SPAN(S)=* 0.016393. 0.0175. 0.031296 




SMOOTHING WITH SPUT UNEAR FT T 
WNSZ(S)= 11. 13. 21; MNWNSZ=s 5 




Figure 5.4 Comparison of Smoothers 
Using Neighborhood Size of 13. 
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TABLE 9 




SUM OF SQUARED RESIDUALS USING NEIGHBORHOOD SIZE OF 

13 


Type of Fit 


Sum of Squared 
Residuals 


Spline (0,200) 


200.51872 


Equal- Weight Moving Average, M= 13 


236.86018 


Cosine-Weighted Moving Average, M= 13 


134.92327 


LOWESS, robust, F= 0.0175 


86.4623 


LOWESS, non-robust, F= 0.0175 


74.3993 


Supersmoother , ALPHA= 0.0 

SPAN(s)= (3.016393, 0.0175, 0.031296 


233.94982 


Split Linear Fit. MNWNSZ= 2 
WNSZ(s)= 1 i, 13, 21 


186.25445 



TABLE 10 




CPU USAGE: NEIGHBORHOOD 


SIZE OF 13 


Type of Fit 


CPU Consumed 
(in Seconds) 


Spline (0,200) 


38.0 


Equal-Weight Moving Average, M= 13 


0.04 


Cosine-Weighted Moving Average, M= 13 


0.73 


LOWESS, robust, F= 0.0175 


12.02 


LOWESS, non-robust, F= 0.0175 


3.95 


Supersmoother, ALPHA= 0.0 

SPAN(s)= 6.016393, 0.0175, 0.031296 


2.32 


Split Linear Fit. MNWNSZ= 2 
WNSZ(s)= li, 13, 21 


3.77 
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the advanced smoothers in addition to being the most accurate 
smoother . 

With a neighborhood size of little less than a month, i.e. 21 which 
is equivalent to 21 days since time is the unit of measurement of the 
abscissa in this data set, most of the smoothers produce smooth curves 
which have lost the jagged effect at the high and low points of the 
smooth curves, see Figure 5.5. In this figure, the smooth curve 
produced by Split Linear Fit displays its tendency to follow and empha- 
size outliers. Split Linear Fit results are so robust that the peaks are 
shown pointed and not round as displayed by the other smoothers. 

This effect is caused by the edge-detection weighting scheme of Split 
Linear Fit. As shown in Table 11, the sum of squared residuals 

produced by the advanced smoothers are higher than most of the other 
smoothers, even though a very similar smooth curve is produced by all 
the smoothers. The difference between the sum of squared residuals 
value corresponding to Split Linear Fit and Supersmoother, respec- 
tively, can be explained by the tendency of Split Linear Fit to follow 
outliers more closely than Supersmoother. The Split Linear Fit 
smoother lets the raw data dictate the shape of the smooth curve, 
therefore, the difference between the raw data and the smoothed data is 
smaller for the Split Linear Fit than for the Supersmoother. Table 12 
shows again that the Cosine-Weighted Moving Average smoother is 
faster, even though the sum of squared residuals value may not be the 

best. The Supersmoother and the Split Linear Fit smoothers have 

consistently maintained their usage of CPU. 

Figure 5.6 displays the results produced by using a neighborhood 
size equivalent to almost one month, i.e. 29 days. Though still quite 
similar, each smoother produces a visibly different smooth curve. The 
only exception is Equal-Weight Moving Average smoother which has 
suppressed the influence of the outliers. The shape of the input data 
is still being maintained by most of the smoothers, especially Split 
Linear Fit which is designed to do so. In Table 13, the sum of 
squared residuals values corresponding to the advanced smoothers do 
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SPUNE FIT, PARAMETERS (0.225) 



SMOOTHING WITH equal-weight MOVING AVERAGE, M- 21 





SMOOTHING WITH COSINE-WEIGHTED MOVING AVERAGE. M- 21 




SMOOTHING WITH LOWESS, F- 0.031296 
ROBUST, OUARDRATIC FITTING 




SMOOTHING WITH SUPERSMOOTHER. ALPHA- 0.0 
SPAN(S)— 0.0175. 0.031296, 0.04322 




SMOOTHING WITH SPUT UNEAR FIT 
WNSZ(S)- 13, 21. 29; MNWNSZ- 5 




Figure 5.5 Comparison of Smoothers 
Using Neighborhood Size of 21. 
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TABLE 11 




SUM OF SQUARED RESIDUALS USING NEIGHBORHOOD SIZE OF 

21 


Type of Fit 


Sum of Squared 
Residuals 


Spline (0,225) 


225.04812 


Equal-Weight Moving Average, M= 21 


331.967483 


Cosine -Weighted Moving Average, M= 21 


209.75158 


LOWESS, robust, F= 0.031296 


162.382 


LOWESS, non-robust, F= 0.031296 


142.891 


Supersmoother, ALPHA= 0.0 

SPAN(s) = (3.0175, 0.031296, 0.04322 


270.39113 


Split Linear Fit, MNWNSZ= 2 
WNSZ(s)= l6, 21, 29 


236.34410 



TABLE 12 




CPU USAGE: NEIGHBORHOOD 


SIZE OF 21 


Type of Fit 


CPU Consumed 
(in Seconds) 


Spline (0,225) 


43.87 


Equal-Weight Moving Average, M= 21 


0.07 


Cosine-Weighted Moving Average, M= 21 


0.76 


LOWESS, robust, F= 0.031296 


29.56 


LOWESS, non-robust, F= 0.031296 


10.03 


Super smoother, ALPHA= 0.0 

SPAN(s)= 6.0175, 0.031296, 0.04322 


2.25 


Split Linear Fit, MNWNSZ= 2 
WNSZ(s)= li, 21, 29 


3.7 
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TEMPERATURE IN DEGREE CENT. TEMPERATURE IN DEGREE CENT. w TEMPERATURE IN DEGREE CENT. 



SPUNE FIT. PARAMETERS (0,250) 



SMOOTHING WITH EQUAL-WEIGHT MOVING AVERAGE. M= 29 





SOOTHING WITH COSINE-WBGHTED MOVING AVERAGE. M- 29 




SMOOTHING WITH LOWESS. F- 0.04322 
ROBUST. QUARD RATIO FITTING 




SMOOTHING WITH SUPERSMOOTHER. ALPHA- 0.0 
SPAN(S)= 0.0175. 0.04322. 0.09091 




SMOOTHING 'WITH SPLIT UNEAR FIT 
WNSZ(S)= 13, 29. 61; MNWNSZ= 5 




Figure 5.6 Comparison of Smoothers 
Using Neighborhood Size of 29. 
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TABLE 13 




SUM OF SQUARED RESIDUALS USING 

2 9 


NEIGHBORHOOD SIZE OF 


Type of Fit 


Sum of Squared 
Residuals 


Spline (0,250) 


249.84103 


Equal-Weight Moving Average, M= 29 


411.15847 


Cosine- Weighted Moving Average, M= 29 266.83401 


LOWESS, robust, F= 0.04322 


235.881 


LOWESS, non-robust, F= 0.04322 


202.825 


Supersmoother, ALPHA= 0.0 

SPAN(s)= (3.0175, 0.04322, 0.09091 


265.40284 


Split Linear Fit. MNWNSZ= 2 
WNSZ(s)= l3, 29, 61 


211.79171 



TABLE 14 




CPU USAGE: NEIGHBORHOOD 


SIZE OF 29 


Type of Fit 


CPU Consumed 
(in Seconds) 


Spline (0,250) 


42.55 


Equal-Weight Moving Average, M= 29 


0.08 


Cosine- Weighted Moving Average, M= 29 


0.81 


LOWESS, robust, F= 0.04322 


34.21 


LOWESS, non-robust, F= 0.04322 


11.28 


Supersmoother, ALPHA= 0.0 

SPAN(s)= (3.0175, 0.04322, 0.09091 


2.28 


Split Linear Fit, MNWNSZ= 2 
WNSZ(s)= li, 29, 61 


3.78 
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not deviate as much as in the three past cases from the sum of squared 
residuals values corresponding to the other smoothers. Table 14 basi- 
cally follows the same explanation of Table 12. 

If a neighborhood of two months is used, the smooth curves shown 
in Figure 5.7 are produced. Each curve is now quite different from 
the other curves. It is now quite noticeable that the Moving Average 
type smoothers have lost smooth data points at the ends. LOWESS has 
been able to maintain the shape of the input data. The Split Linear Fit 
has made a good attempt to do as well with the edge -detecting 
weighting scheme. Since Supersmoother is a central smoother and the 
neighborhood size is larger than the intra- seasonal period, the smooth 
curve produced is quite ! smooth f , i.e. not jagged and abrupt. The sum 
of squared residuals values shown in Table 15 reflect the superiority of 
LOWESS over the other smoothers. The smooth curves displayed in 
Figure 5.7 substantiate even more LOWESS T s superior performance. 
LOWESS has a better sum of squared residuals value and a better 
smooth curve. Table 16 shows that LOWESS is almost the slowest of the 
smoothing techniques, but it has the lowest sum of squared residuals. 

The last figure, Figure 5.8, is shown basically to illustrate that the 
Moving Average smoothers have begun to deteriorate, i.e. loose too 
many smooth data points at the ends and deviate from the shape of the 
raw data. LOWESS and the advanced smoothers are still maintaining the 
general shape of the raw data, but are beginning to get too smooth. 
The Split Linear Fit smoother has made a good attempt to depict the 
outliers, even with a large neighborhood size, but the price paid is the 
return of the undesirable sharp peaks with plateau-like bases. If the 
neighborhood size is gradually increased, the resulting smooth curves 
will change from those in Figure 5.8 to smooth sinusoidal curves, and 
eventually to straight lines. The sinusoidal curves and the straight 
lines illustrate only general features about the raw data and defeat the 
purpose of smoothing. As shown in Table 17, where the neighborhood 
size is 91 days, the advanced smoothers finally produced the low sum of 
squared residuals values. This figure was made to illustrate that the 
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SPUNE FIT, PARAMETERS (0,300) 



SMOOTHING WITH EQUAL- WEIGHT MOVING AVERAGE. M* 61 





smoothing wrm cosine-wqghteo moving average, m- 6i 




SMOOTHING WITH LOWESS, F- 0.09091 
ROBUST, QUARO RATIO FITTING 




SMOOTHING WITH SUPERSMOOTHER, ALPHA* 0.0 
SPAN(S)* 0.04322, 0.09091, 0.13562 




SMOOTHING WITH SPUT LINEAR FIT 
WNSZ(S)* 29. 61. 91; MNWNSZ* 10 




Figure 5.7 Comparison of Smoothers 
Using Neighborhood Size of 61. 
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TABLE 15 




SUM OF SQUARED RESIDUALS USING NEIGHBORHOOD SIZE OF 

61 


Type of Fit 


Sum of Squared 
Residuals 


Spline (0,300) 


299.19197 


Equal-Weight Moving Average, M= 61 


517.92528 


Cosine-Weighted Moving Average, M= 61 


409 . 40077 


LOWESS, robust, F= 0.09091 


372.995 


LOWESS, non-robust, F= 0.09091 


372.263 


Supersmoother, ALPHA= 0.0 

SPAN(s)= 6.04322, 0.09091, 0.13562 


453.02390 


Split Linear Fit. MNWNSZ= 2 
WNSZ(s)= 26 , 61, 91 


423.35262 



TABLE 16 




CPU USAGE: NEIGHBORHOOD 


SIZE OF 61 


Type of Fit 


CPU Consumed 
(in Seconds) 


Spline (0,300) 


46.17 


Equal-Weight Moving Average, M= 61 


0.14 


Cosine-Weighted Moving Average, M= 61 


0.98 


LOWESS, robust, F= 0.09091 


51.78 


LOWESS, non-robust, F= 0.09091 


17.44 


Supersmoother, ALPHA= 0.0 

SPAN(s)= 6.04322, 0.09091, 0.13562 


2.28 


Split Linear Fit, MNWNSZ= 2 
WNSZ(s)= 26, 61, 91 


3.8 
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advanced smoothers maintain the shape of the raw data better than the 
other smoothers when larger neighborhood sizes are used in the 
smoothing. Table 18 illustrates that the advanced smoothers maintain a 
constant CPU usage, throughout the evaluation. Tables 8 through 18 
have larger CPU usage than in the previous chapter. The reason for 
this is that the data sizes are different. 

D. CONCLUSIONS 

The two advanced smoothers investigated in this thesis, the 
Supersmoother and the Split Linear Fit, generate adequate smooth 
curves. They are faster than most current smoothing techniques. 
However, their many inputs make their implementation difficult. 

Two simpler smoothers are the LOWESS and the Cosine-Weighted 
Moving Average. Both only require a single neighborhood size as 
input. This dramatically reduces the complexity of the program for the 
user. Both generate smooth curves with satisfactory results equal to 

the advanced smoothers. However, both LOWESS and the 

Cosine-Weighted Moving Average produce better sum of squared resi- 
duals values. In addition, the Cosine-Weighted Moving Average is much 
faster than either of the advanced smoothers. 

The simpler smoothers, LOWESS and the Cosine-Weighted Moving 
Average, do have some drawbacks. LOWESS is considerably slow than 
the advanced smoothers, but the disadvantage of LOWESS is only 
apparent after many runs of the programs. The speed difference for a 
single run is minor, measured only in seconds. The disadvantage of 
the Cosine-Weighted Moving Average smoother is that values are 
dropped from the ends of the output array, as illustrated in this 
thesis. The larger the neighborhood size, the more smoothed values 
are dropped, sometimes these values are important and other times they 
are not; this decision belongs to the user. 

It is the recommendation of the author that LOWESS or the 
Cosin-Weighted Moving Average smoother be used over either of the 
advanced smoothers . The advanced smoothers are considerably more 
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SPLINE FIT. PARAMETERS (0,450) 



SMOOTHING WITH EQUAL- WEIGHT MOVING AVERAGE. M- 91 






SMOOTHING WITH LOWESS. F- 0.13562 
R08UST. OU AO RATIO FITTING 




SMOOTHING WITH SUPERSMOOTHER. ALPHA- 0.0 
SPAN(S)= 0.04322. 0.13562. 0.2489 




SMOOTHING WITH SPUT UNEAR FIT 
WNSZ(S)= 29. 91. 167; MNWNSZ= 10 




Figure 5.8 Comparison of Smoothers 
Using Neighborhood Size of 91. 
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TABLE 17 




SUM OF SQUARED RESIDUALS USING 

91 


NEIGHBORHOOD SIZE OF 


Type of Fit 


Sum of Squared 
Residuals 


Spline (0,450) 


450.54580 


Equal- Weight Moving Average, M= 91 


550.83383 


Cosine-Weighted Moving Average, M= 91 463.34429 


LOWESS, robust, F= 0.13562 


458.317 


LOWESS, non-robust, F= 0.13562 


454.213 


Supersmoother, ALPHA= 0.0 

SPAN(s)= 6.04322, 0.13562, 0.2489 


435.71549 


Split Linear Fit. MNWNSZ= 2 
WNSZ(s)= 26, 91, 167 


452.55936 



TABLE 18 




CPU: USAGE NEIGHBORHOOD 


SIZE OF 91 


Type of Fit 


CPU Consumed 
(in Seconds) 


Spline (0,450) 


55.27 


Equal- Weight Moving Average, M= 91 


0.2 


Cosine-Weighted Moving Average, M= 91 


1.1 


LOWESS, robust, F= 0.13562 


68.04 


LOWESS, non-robust, F= 0.13562 


23.01 


Supersmoother, ALPHA= 0.0 

SPAN(s)= 6.04322, 0.13562, 0.2489 


2.27 


Split Linear Fit. MNWNSZ= 2 
WNSZ(s)= 26, 91, 167 


3.82 
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complex than these smoothers without yielding better accuracy. As 
mentioned before, the speed difference is minor, measured only in 
seconds. 



115 



VI. INSTRUCTIONS QN USING THE ADVANCED SM OOTHERS 



A. GENERAL 

This chapter provides detailed instructions on how to use the 
smoothing programs developed by the author of this thesis and have the 
advanced smoothing algorithms embedded in them. The Supersmoother 
algorithm is embedded in one smoothing program and the Split Linear 
Fit algorithm is embedded in another smoothing program. Any inter- 
ested person should be familiar with this chapter before attempting to 
do any data smoothing with these programs. In order to obtain good 
and fast results and understand the smoothing algorithms, it is highly 
recommended that the user read either or both of Chapters II and III, 
depending on the program to be used. Before adjusting any embedded 
parameters, it is essential that the user read the 'Technical Description' 
chapter corresponding to the program being modified. These programs 
are designed to be used on the IBM 3033 computer currently at the 
Naval Postgraduate School. The programs are written in FORTRAN 77, 
because of the need to use negative index values. The use of both of 
the smoothing programs is very similar, so both are addressed in this 
chapter. Operations peculiar to each program are addressed as sepa- 
rate paragraphs corresponding to each program. 

The smoothing programs are completely interactive, in other words, 
the user enters the data and other pre-defined parameters when asked 
by built-in queries. The user has the option of selecting the one of 
several types of output, which are: 

1. create a CiMS file and place the smooth output into this newly 
created CMS file; 

2 . place the smooth output into an existing APL workspace within 
a newly created APL variable; 

3. create an APL workspace and place the smooth output into this 
newly created workspace, or; 

4. plot the smooth output using the GRFASTAT graphics package. 
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The smoothing programs are quite flexible in letting the user decide 
where to put the final smooth output. The user can smooth any 
number of points up to 5000 data points. The data to be smoothed 
need not be in order, because the smoothing programs have an 
embedded sorting subroutine which sorts the data into chronological 
according to the abscissa values entered by the user, before sending 
the data array to the advanced smoothing algorithms. 

The programs are written in such a way that if the user makes a 
mistake, it will be announced, and the program can be restarted or 
stopped. Practically no knowledge of FORTRAN is needed to run these 
programs. If APL is to be used, it is best to understand what the 
relevant 'workspace' and 'variable' names are [Ref. 15] . If the user 
wants to use GRAFSTAT to plot the output, it is best to get familiar 
with the GRAFSTAT 'PLOT' and 'AXIS CONTROL' functions before 
attempting to use the plotting option embeded within the advanced 
smoothing programs. 

B. TERMINAL REQUIREMENTS 

If used to create a CMS data file the Supersmoothing program is 
used to can be run from any remote terminal attached to the IBM 3033 
and located within the Naval Postgraduate School. If this smoothing 
program is to be used for APL workspace creation, then an appropriate 
APL terminal must be used. If the GRAFSTAT plotting option is to be 
used, then the IBM 3277/TEK 618 graphics terminals must be used. 

Because the Split Linear Fit generates a great amount of data it 
must be run on the IBM 3277/TEK 618 graphics terminals with a memory 
capacity of at least 2 Mega-Bytes. The bigger the input data set, the 
more data storage that the Split Linear Fit smoothing program will need. 
For each point that is to be smoothed, the computer needs the capacity 
to store a matrix that has the dimensions of 9 by the number window 
sizes entered. For example, if 200 data points are to be smoothed with 
the Split Linear Fit smoothing program and 6 window sizes are entered, 
then each data point will need a matrix of size 9 by 6. Therefore, to 
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run this smoothing program, the user must have available the storage 
capacity for a matrix that is 200 by 9 by 6, i.e. 10,800 bytes, plus the 
storage capacity for the raw data, the corresponding abscissa and 
computed weights, the temporary smoothed point values, and the final 
smoothed point values. 

C. INPUT DATA FILES 

In order to use either of the smoothing programs, it is required 
that the data which is to be smoothed be in a CMS file with filetype 
'data'. If the data points are not in chronological order, the corre- 
sponding abscissa, i.e. numerical order, must be in another CMS file 
with filetype 'ORD'. 

D. PROGRAM INITIALIZATION 

The advanced smoothing program packages can be obtained from 
Professor P. A. W. Lewis, Department of Operations Research, U. S. 
Naval Postgraduate School, Monterey, CA. The Supersmoothing 
program consists of the following files: 

1. SUPSMO EXEC Al; 

2. SUPSMO FORTRAN Al; 

3. SUPSMO VSAPLWS Al. 

A copy of these files is in Appendix A. The Split Linear Fit program 
consists of the following files: 

1. SPTLIN EXEC Al; 

2. SPTLIN FORTRAN Al; 

3. SPTLIN VSAPLWS Al. 

A copy of each of these files is in Appendix B. It is essential that the 
three respective files be on the same disk when either smoothing 
program is to used. The EXEC file do the following operations: 

1. activates the IBM; 

2. system libraries; 

3. queries the user for the input; 

4. designates the computer storage space to be used for input and 
output; 
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5. loads and runs the FORTRAN file; 

6. executes the APL or GRAFSTAT user options; 

7. returns the disk to the original state, i.e. erases the TEXT 
and LOAD files, so as not to overload the disk being used. 

The Supersmoother smoothing program is invoked by typing 
'SUPSMO' and then pressing the ENTER key on the keyboard. Next 
the user must read the information displayed and comply with the 
instructions. As long as the user follows the instructions, the 
smoothing program 'SUPSMO 1 will produce the desired results. If any 
deviations from the requested data occur 'SUPSMO' will let the user 
know. The Split Linear Fit smoothing program is just as easy. It is 
invoked by typing 'SPTLIN' and pressing the ENTER key on the 
keyboard. Read the information on the screen, answer the questions, 
and 'SPTLIN' does the rest. 

An example of a session using SUPSMO to create the Supersmoother 
curve in Figure 5.3 is in Appendix E. A session using the Split Linear 
Fit smoothing program SPTLIN basically follows the same line of 
questions. 

E. OUTPUT FILES 

The smoothing programs will put the smooth output where the user 
designates unless a file already exits with that name. If a CMS file 

already exists by the name that the user wants, then the session will 

be terminated, told the reason for the termination and to restart the 
program. If the CMS file does not exist then the program continues 
normally. For APL files, the program queries the user about the status 
of the file, i.e. exiting or to be created. If the new file exists or if 

old file does not exist, then the session is terminated, the user is told 

the reason for the termination and to restart. One word of caution: 
THE SMOOTHING PROGRAMS WILL NOT WRITE OVER AN EXISTING 
APL VARIABLE!!! The program will continue running normally, but 
the data will be lost. Therefore, it is up to the user to manage the 
disk space properly and to keep note of which file contains what type 
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of smooth data. The CMS file created by the smoothing programs will 
contain five smooth data points per row. The length will depend on the 
number of points that are smoothed, i.e. the number of points smoothed 
divided by five. 

In order to put data into an existing APL workspace or create a 
new workspace, the user types in the name of the APL workspace when 
asked to do so. The will verify the status of the workspace as mention 
before. If everything is satisfactory, the program continues. 

Should the user have any specific questions about the programs it 
is recommended that the 'TECHNICAL DESCRIPTION' chapter be read. 
These chapters basically follow the smoothing procedure step by step. 
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APPENDIX A 

SUPERSMOOTHER PROGRAM 



1. SUPSMO EXEC 



The following computer file is SUPSMO EXEC which activates and 
runs the Super smoother smoothing program. Chapter VI contains 
instructions on how to use this smoothing program. 



VLNKMLIB VALTLIB VFORTLIB IMSLSP NONIMSL 



ScTRACE 
SET BLIP * 

GLOBAL TXTLIB 
CMSLIB 
CLRSCRN 

STYPE YOU HAVE INITIATED AN ALGORITHM 
STYPE TO SMOOTH A SET OF DATA USING THE 
STYPE ALGORITHM " SUPER SMOOTHER r ' 

STYPE DEVELOPED BY FRIEDMAN AND STUETZLE OF 
STYPE STANFORD UNIVERSITY DEPT. OF STATISTICS 
STYPE 

&TYPE IF GRAPHICS WILL NOT BE USED DEFINE STORAGE AS 1024 K 
&TYPE BY ENTERING ' DEF STOR 1024 #' 

STYPE FOLLOWED BY ' I CMS ' , 

&TYPE THEN BY ' SUPSMO ' 

STYPE 

&IYPE DO YOU WISH TO CONTINUE? 

&TYPE ENTER Y FOR YES OR ANY OTHER KEY TO EXIT: 

SURE AD VAR SCONT 
CLRSCRN 

&IF SCONT NE Y SGOTO -EXIT 

STYPE IN ORDER TO USE THIS ALGORITHM 

STYPE YOU MUST HAVE ON HAND THE FOLLOWING: 

&TYPE 

FILENAME OF DATA FILE ( FILETYPE DATA ) WITH 
DATA TO BE SMOOTHED 



ScTYPE 1 
STYPE 
STYPE 
&TYPE 2 
STYPE 
ScTYPE 
ScTYPE 
ScTYPE 
ScTYPE 3 
STYPE 
ScTYPE 
ScTYPE 
STYPE 
STYPE 4 
ScTYPE 
STYPE 
ScTYPE 
ScTYPE 5 
ScTYPE 
ScTYPE 
USED. 

STYPE 

STYPE DO YOU WISH TO CONTINUE? 

ScTYPE ENTER Y FOR YES OR ANY OTHER KEY TO EXIT: 
ScREAD VAR SCO NT 
CLRSCRN 

STF ScCONT NE Y ScGOTO -EXIT 



IF DATA POINTS ARE NOT IN CHRONOLOGICAL ORDER , 

YOU NEED TO HAVE A FILE ( FILETYPE ORDER ) 

WITH INDICES CORRESPONDING TO DATA POINTS 
INDICATING THE ORDER OF THE DATA POINTS. 

FILENAME OF DATA FILE WHERE SMOOTHED OUTPUT 
WILL BE WRITTEN OR IF YOU WANT 

TO WRITE OUTPUT INTO APL HAVE ON HAND VARIABLE 
AND WORKSPACE NAMES THAT WILL STORE THE OUTPUT. 

IF YOU WANT TO SMOOTH THE DATA USING ONLY 

ONE WINDOW SIZE , HAVE ON HAND 

THE DECIMAL FRACTION OF THE DATA TO BE USED. 

IF YOU WANT TO SMOOTH THE DATA USING 
THREE WINDOW SIZES, HAVE ON HAND 

THE THREE DECIMAL FRACTIONS OF THE DATA TO BE 
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STYPE ENTER FILENAME OF FILE WHICH CONTAINS 
STYPE THE DATA TO BE SMOOTHED : 

SREAD ARGS 

STF SN = 0 SGOTO -TELL 

&IF &N > 1 &GOTO -TELL 

STATE &1 DATA A 1 

&IF SRC NE 0 SGOTO -ERROR 

CLRSCRN 

-ORDR CLRSCRN 

STYPE ARE DATA POINTS TO BE SMOOTHED 
STYPE IN CHRONOLOGICAL ORDER? 

STYPE ENTER Y FOR YES OR N FOR NO: 

&READ VAR &CONT 

&IF &CONT EQ Y SGOTO -GO 

STYPE SINCE THE DATA POINTS ARE NOT 

STYPE IN CHRONOLOGICAL ORDER, THEREFORE 

STYPE ENTER FILENAME OF FILE ( FILETYPE ORDER ) 

&TYPE THAT CONTAINS ORDER INDICES 

&READ VAR SORD 

STATE SORD ORDER A 1 

SIF SRC NE 0 SGOTO -ERROR 

-GO CLRSCRN 

STYPE THE DATA YOU WANT TO SMOOTH IS IN &1 DATA 
STYPE WHERE DO YOU WANT TO WRITE THE SMOOTHED OUTPUT? 

&TYPE CMS OR APL? 

- STRT STYPE YOU CAN PLOT THE SMOOTHED OUTPUT 
&TYPE IF YOU ARE LOGGED ON A TERMINAL 

STY PE THAT CAN ACCESS GRAF ST AT , I.E. HAVE 2 M OF STORAGE 
STYPE BUT THE OUTPUT MUST BE STORED IN AN APL VARIABLE 
STYPE ENTER APL OR CMS: 

SREAD VAR SPLA 

SIF SPLA EQ APL SGOTO -API 

STYPE 

CLRSC RN 

STYPE THE SMOOTHED OUTPUT WILL BE WRITTEN 
STYPE TO A CMS FILE ( FILETYPE DATA) 

STYPE ENTER ONLY THE FILENAME YOU WANT 
STYPE TO USE FOR THAT CMS FILE: 

SREAD VAR SFN 

STYPE THE SMOOTHED OUTPUT WILL BE WRITTEN 
STYPE INTO THE CMS FILE SFN DATA 
SGOTO -COM 
-API STYPE 

STYPE NOT USING THE NAME OF THE FILE 
STYPE WITH THE INPUT DATA. SI 
STYPE ENTER THE NAME OF THE APL VARIABLE 
STYPE THAT WILL STORE THE OUTPUT: 

SREAD VAR SA 

STYPE DO YOU WANT TO PLOT THE OUTPUT? 

STYPE ENTER Y FOR YES OR N FOR NO: 

SREAD VAR SGRF 

SIF SGRF EQ Y SGOTO -PLOT 

STYPE ENTER THE NAME OF THE APL WORKSPACE 
STYPE THAT WILL CONTAIN SA : 

SREAD VAR SWKS 

STYPE IS SWKS AN EXISTING WORKSPACE OR A NEW WORKSPACE? 
STYPE ENTER 0 FOR EXISTING OR N FOR NEW: 

SREAD VAR SAGE 
SFN = TE 
SGOTO -COM 

- PLOT STYPE CAN YOU ACCESS 2 M OF STORAGE 
STYPE ON THIS DISK {TERMINAL)? 

STYPE ENTER Y FOR YES OR N FOR NO: 

SREAD VAR SGRF 

SIF SGRF EQ N SGOTO -STRT 

SFN = TE 

_ COM CLRSCRN 

STYPE PLEASE READ THE FOLLOWING INSTRUCTIONS VERY CAREFULLY 
STYPE ARE YOU READY TO START THE SUPER SMOOTHING PROGRAM? 
STYPE ENTER Y FOR YES OR ANY OTHER KEY TO EXIT: 
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GREAD VAR GO 
CLRSCRN 

&IF GO NE Y &GOTO -EXIT 

&TYPE PLEASE WAIT THE SMOOTHING PROGRAM IS BEING COMPILED 

FI 04 CLEAR 

FI 05 CLEAR 

FI 06 CLEAR 

FI 0 7 CLEAR 

FI 08 CLEAR 

FI 09 CLEAR 

FORTVS SUPSMO (LVL ( 77 )) 

FIL 04 DISK &ORD ORDER 
FIL 07 DISK G1 DATA 

FIL 08 DISK GFN DATA ( RECFM FBA LRECL 80 BLKSIZE 800 ) 
CLRSCRN 

GTYPE PLEASE WAIT SMOOTHING PROGRAM IS BEING LOADED 

LOAD SUPSMO (START 

CLRSCRN 

ERASE SUPSMO LISTING 

ERASE SUPSMO TEXT 

ERASE LOAD MAP 

&IF &PLA EQ CMS &GOTO -EX 

&IF GGRF EQ N G GOTO - NGRF 

CP TERMINAL APL ON 

GSTACK )LOAD SUPSMO 

GSTACK GA +CMSREAD 

&STACK GFN 

&STACK DATA 

&STACK N 

&STACK GA .GA 

GSTACK G1 +CMSREAD 

&STACK G1 

GSTACK DATA 

&STACK N 

GSTACK &1 4 - ,G1 

&STACK )SAVE 

&TYPE ****PLEASE WAIT , LINKING TO GRAFSTAT* ****** ********** 
&STACK )LOAD GRAFSTAT 
&STACK DUM <rCMS ' CLRSCRN ' 

&STACK 1PV0PY~SUPSM0 
& STACK ST RT 
EXEC APLGS~ 

&.GOTO -DRP 

-NGRF CP TERMINAL APL ON 

&STACK )LOAD SUPSMO 

&STACK &A <rCMSREAD 

&STACK &FN 

GST AC K DATA 

GST AC K N 

GST AC K GA *■ ,GA 

GSTACK )SA VE 

GST AC K ) CLEAR 

GIF GAGE EQ 0 GSTACK )LOAD GMKS 
GIF GAGE EQ N GSTACK )WSID GNKS 
GSTACK )PCOPY SUPSMO GA 
GSTACK )SAVE 
GSTACK )OFF HOLD 
EXEC APL 

-DRP ERASE GFN DATA * 

CP TERMINAL APL ON 
GSTACK }LOAD SUPSMO 
GSTACK ) ERASE GA 
GSTACK }ERASE G1 
GSTACK )SAVE 
GSTACK )OFF HOLD 
EXEC APL 

-EX GTYPE YOU HAVE FINISHED 
GEXIT 1000 

- TELL GTYPE YOU HAVE ENTERED TOO MANY OR 
GTYPE NOT ENOUGH ENTRIES ABOUT DATA FILE 
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STYPE YOU NEED TO BEGIN AGAIN BY ENTERING 
STYPE 

&TYPE SUPSMO 
SEXIT 100 
SGOTO -EX 

-ERROR STYPE ABOVE ENTERED FILE DATA 
STYPE DOES NOT EXIST ON YOUR A- DISK 
&TYPE CHECK YOUR FLIST AND 

&TYPE THEN BEGIN AGAIN BY ENTERING 

&TYPE 

&TYPE SUPSMO 

SEXIT 101 

- EXIT &TYPE YOU HAVE FORCED AN EXIT ON THIS SMOOTHING EXEC 

SHY PE IF YOU WISH TO BEGIN AGAIN ENTER 

STYPE 

STYPE SUPSMO 

SEXIT 102 



2. SUPSMO FORTRAN 



The following file is SUPSMO FORTRAN which does the actual 
smoothing of a data set. The subroutines SUPSMU and SMOOTH of the 
following FORTRAN program were developed by Friedman and Stutzle 
[Ref. 9] as stated in Chapter I. 

C 

C READ SUPSMO EXEC FILE BEFORE USING THIS FILE. 

C 

Q*A* ********************************************************** x************** ****** 

C THIS PROGRAM READS THE INPUT DATA, Y(N) VARIABLES FROM THE FILE 
C WATER DATA A1 AND THEN USES THE INTERNAL SUPER SMOOTHING SUBROUT.* 

C IN ORDER TO SMOOTH THE INPUT DATA. 

C THE SPANS CAN BE CHANGED BY ENTERING DESIRED SPANS ON THE VERY 
C LAST LINE OF THIS FILE. 

0 * ******** *************************************************** ********************* 

C 

C 

C INPUT: 

C N : NUMBER OF OBSERVATIONS (X,Y - PAIRS) 

C X(N) : ORDERED ABSCISSA VALUES 
C Y(N) : CORRESPONDING ORDINATE (RESPONSE) VALUES 
C W(N) : WEIGHT FOR EACH (X,Y) OBSERVATION 
C IPER: PERIODIC VARIABLE FLAG 
C IPER — 1 : X IS ORDERED INTERVAL VARIABLE 

C IPER — 2 : X IS A PERIODIC VARIABLE WITH VALUES 

C IN THE RANGE (0.0, 1.0) AND PERIOD 1.0 

C SPAN : SMOOTHER SPAN (FRACTION OF OBSERVATIONS IN WINDOW). 

C SPAN = 0.0 : AUTOMATIC (VARIABLE) SPAN SELECTION 

C ALPHA: CONTROLS HIGH FREQUENCY (SMALL SPAN) PENALTY 
C USED WITH AUTOMATIC SPAN SELECTION (BASE TONE CONTROL) 

C (ALPHA. LE.0,0 OR ALPHA. GT.10.0 : NO EFFECT) 

C OUTPUT: 

C SMO(N): SMOOTHED ORDINATE (RESPONSE) VALUES 
C SCRATCH" 

C SC(N, 7): INTERNAL WORKING STORAGE 

C NOTE: 

C FOR SMALL SAMPLES (N < 40) OR IF THERE ARE SUBSTANTIAL SERIAL 
C CORRELATIONS BETWEEN OBSERVATIONS CLOSE IN X - VALUE, THEN 
C A PRESPECIFIED FIXED SPAN SMOOTHER (SPAN > 0) SHOULD BE 
C USED. REASONABLE SPAN VALUES ARE 0.3 TO 0.5. 

C 

C 

REAL*4 Y(5000),X(5000),SMO(5000),W(5000), SPAN, ALPHA, SC(5000, 7) 

REAL*4 ACVR(5000),TPANS(3) 

INTEGER IR(5000),K,N, IPER, WEI, ODR 
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DOUBLE PRECISION WT,FBO,FBW,XM,YM f TMP,VAR,CVAR,A,H,SY 
COMMON /CONSTS/ BIG,SML,EPS 
WRITE(5,1 ) 

1 FORMATf IX, 'ENTER THE NUMBER OF DATA POINTS TO BE SMOOTHED— 
‘INTEGER VALUE') 

READ(6,‘)N 

18 D019I-1.N 

19 W(l)-1. 

9 WRITE(5,12) 

12 FORMAT(1X/ARE THE INPUT DATA POINTS IN CHRONOLOGICAL 
‘ORDER?', /,1X, 'ENTER 0 FOR NO OR 1 FOR YES') 

READ(6,‘)ODR 
IF(0DR.E0.1)G0 TO 13 
READ(4,*)(X(I),I-1,N) 

GO TO 14 

13 DO 15 I = 1,N 

15 X(l) — FLOAT(I) 

14 CALL FRTCMS('CLRSCRN ') 

WRITE(5,5) 

5 FORMAT(1X, 'ENTER 1.0 IF YOU DESIRE TO USE ONLY ONE SPAN VALUE', 

‘//ENTER 0.0 IF YOU WANT TO USE THREE SPAN VALUES') 

READ(6,*)SPAN 

CALL FRTCMS('CLRSCRN ') 

IF(SPAN.EO.1.0)THEN 

WRITE(5,8)N 

8 FORMAT(1X,'ENTER THE SPAN VALUE TO BE USED',/, IX, 'FRACTION OF',15, 

‘'I.E. A REAL NUMBER BETWEEN 0.0 AND 1.0') 

READ(6,‘)SPAN 
ALPHA - 0.0 
ELSE 

WRITE(5,2)N 

2 FORMAT(1X, 'ENTER THE LOWEST SPAN VALUE:', /,1X,'FRACTION OF', 

‘15/ I.E. A REAL NUMBER BETWEEN 0.0 AND 1.0') 

READ(6,‘)TPANS(1 ) 

WRITE(5,3)N 

3 FORMAT(1X,'ENTER THE MIDDLE SPAN VALUE:',/, IX/ FRACTION OF' 

M5/I.E. A REAL NUMBER BETWEEN 0.0 AND 1.0') 

READ(6,‘)TPANS(2) 

WRITE(5,4)N 

4 FORMAT(1X,'ENTER THE HIGHEST SPAN VALUE:',/, IX, 'FRACTION OF' 

*,15/ I.E. A REAL NUMBER BETWEEN 0.0 AND 1.0') 

READ(6,*)TPANS(3) 

CALL FRTCMS('CLRSCRN ') 

11 WRITE(5,16) 

16 FORMAT(1X,'IF ONE OF THE SPAN VALUES IS SMALL',/, 

*'I.E. RESULTS IN A SMALL WINDOW SIZE (10 OR LESS)',/, 

‘'YOU MAY WISH TO ADJUST THE SMOOTH CURVE ROBUSTNESS',/, 

*'BY ENTERING A REAL NUMBER GT 0.0 BUT LT 10.0',/, 

‘'FOR NO ROBUST ADJUSTMENT ENTER 0.0',/, 

*'OR COMPLETE ROBUST ADJUSTMENT ENTER 10.0',/, 

‘'ENTER YOUR CHOICE') 

READ(6,‘)ALPHA 

CALL FRTCMS('CLRSCRN ') 

ENDIF 

WRITE(6,20) 

20 FORMAT(1X,'““PLEASE WAIT SMOOTHING PROGRAM NOW RUNNING*****') 
READ(7,‘)(Y(I),I«1,N) 

IF(ODR.EQ.1)GO TO 17 
CALL SORTER(X,W,Y,N) 

17 IPER-1 
IF(X(N).EO.1.0)IPER«2 

7 CALL SUPSMU(N,X,Y,W,IPER, SPAN, ALPHA, SMO, SC, TPANS) 

WRITE(8,10)(SMO(I),I = 1 , N) 

1 0 FORMAT(2X,5(F1 2.6.2X)) 

STOP 

END 

A* ******** AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAkAAAAAAAA *************** 



AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

SUBROUTINE SUPSMU(N,X,Y,W,IPER,SPAN,ALPHA, SMO, SC, TPANS) 



125 



ooo 



DIMENSION X(U)y(U)MU),ShAO(U) t SC(U,7)JPAUS(3) 

COMMON /CONSTS/ BIG,SML,EPS 
IF (X(N).GT.X(I ))GO TO 30 
SY = 0.0 
SW = SY 
DO 10 J = 1,N 

SY = SY + W(J)*Y(J) 

SW = SW + W(J) 

10 CONTINUE 

A = SY/SW 
DO 20 J = 1,N 
SMO(J) = A 
20 CONTINUE 

RETURN 
30 I = N/4 

j =3 3*1 

SCALE — X(J)-X(I) 

40 IF(SCALE.GT.0.0)GO TO 50 

IF(J.LT.N)J = J + 1 
IF(LGT. 1 )l — 1-1 
SCALE = X(J)-X(I) 

GO TO 40 

50 VSMLSO - (EPS*SCALE)**2 
JPER = IPER 

IF(IPER.EO.2.AND.(X(1).LT.0.0.OR.X(N).GT.1.0))JPER = 1 
IF(JPER.LT.1.0R.JPER.GT,2)JPER = 1 
IF(SPAN.LE.0.0)GO TO 60 

CALL SMOOTH(N,X,Y,W, SPAN, JPER, VSMLSO, SMO, SC) 

RETURN 

60 DO 70 I -1,3 

CALL SMOOTH(N,X,Y,W,TPANS(l), JPER, VSMLSO, SC(1, 2*1-1 ),SC(1, 7)) 

CALL SMOOTH(N,X,SC(1, 7), W,TPANS(2),^JPER, VSMLSO, SC(1, 2*1), H) 

70 CONTINUE 

DO 90 J = 1,N 
RESMIN = BIG 
DO 80 1 = 1,3 

IF(SC(J,2*l).GE.RESMIN)GO TO 80 
RESMIN = S(J,2*I) 

SC(J,7) = TPANS(I) 

80 CONTINUE 

IF(ALPHA.GT.0.0.AND.ALPHA.LE.10.0.AND.RESMIN.LT.SC(J,6))SC(J,7) 

* = SC(J,7) + (TPANS(3)-SC(J,7))*AMAX1(SML 1 RESMIN/SC(J,6))**(10.0- 

* ALPHA) 

90 CONTINUE 

CALL SMOOTH(N,X,SC(1,7),W,TPANS(2),-JPER,VSMLSQ,SC(1 ,2),H) 

DO 110' J = 1,N 

IF(SC(J,2).LE.TPANS(1))SC(J,2) = TPANS(1) 
IF(SC(J,2).GE.TPANS(3))SC(J,2) = TPANS(3) 

F = SC(J,2)-TPANS(2) 

IF(F.GE.0.0)GO TO 100 
F = -F/(TPANS(2)-TPANS(1)) 

SC(J,4) = (1 .0-F)*SC(J,3) + F*SC(J,1) 

GO TO 110 

100 F = F/(TPANS(3)-TPANS(2)) 

SC(J,4) = (1 .0-F)*SC(J,3) + F*SC(J,5) 

110 CONTINUE 

CALL SMOOTH(N,X f SC(1, 4), W,TPANS(1 ),-JPER; VSMLSO, SMO.H) 

RETURN 

END 



SUBROUTINE SMOOTH(N,X,Y,W ( SPAN,IPER, VSMLSO, SMO.ACVR) 
DIMENSION X(N),Y(N),W(N),SMO(N),ACVR(N) 

INTEGER IN, OUT 

DOUBLE PRECISION WT,FBO,FBW,XM,YM,TMP,VAR,CVAR,A,H,SY 

XM =0.0 

YM=XM 

VAR =YM 

CVAR = VAR 
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FBW-CVAR 
JPER *= lABS(IPER) 

IBW = 0.5*SPAN*N + 0.5 
IF(IBW.LT.2)IBW*2 
IT = 2*IBW+1 
DO 20 l-IJT 
J = l 

IF(JPER.E0.2)J — l-IBW-1 
XTI-X(J) 

IF(J.GE.1)GO TO 10 
J = N + J 
XTI«X(J)-1.0 
10 WT-W(J) 

FBO«FBW 

FBW«FBW + WT 

XM - (FBO*XM + WT*XTI)/FBW 

YM = (FBO*YM + WT*Y(J))/FBW 

TMP = 0.0 

IF(FBO.GT.O.O)TM P - FBW‘WP(XTI-XM)/FBO 
VAR « VAR + TMP*(XTI-XM) 

CVAR - CVAR + TM P*(Y(J)-YM) 

20 CONTINUE 

DO 70 J-1.N 

OUT = J-IBW-1 
IN = J + IBW 

IF((JPER.NE.2).AND.(OUT.LT.1.0R.IN.GT.N))GO TO 60 

IF(OUT.GE.1 )GO TO 30 

OUT= N + OUT 

XTO =» X(OUT)-1 .0 

XTI-X(IN) 

GO TO 50 

30 IF(IN.LE.N)GO TO 40 

IN “ IN-N 
XTI “X(IN) + 1.0 
XTO-X(OUT) 

GO TO 50 

40 XTO = X(OUT) 

XTI =X(IN) 

50 WT =» W(OUT) 

FBO = FBW 
FBW-FBW-WT 
TMP = 0.0 

IF(FBW.GT.0.0)TMP=* FBO*WT*(XTO-XM)/FBW 
VAR = VA R-TM P* (XTO-XM ) 

CVAR « CVAR-TM P*(Y(OUT)-YM) 

XM - (FBO*XM-WT*XTO)/FBW 
YM = (FBO*YM-WTY(OUT))/FBW 
WT=W(IN) 

FBO-FBW 

FBW= FBW + WT 

XM - (FBO*XM + WT*XTI)/FBW 

YM = (FBO*YM + WT*Y(IN))/FBW 

TMP = 0.0 

IF(FBO.GT.O.O)TMP = FBW*WT*(XTI-XM)/FBO 
VAR - VAR + TM P*(XTI-XM) 

CVAR - CVAR + TMP*(Y(IN)-YM) 

60 A = 0.0 

I F( VA R. GT. VS M LSO)A - CVAR/VAR 
SMO(J) = A*(X(J)-XM) + YM 
IF(IPER.LE.0)GO TO 70 
H = 1.0/FBW 

I F(VA R. GT.VS M LSO) H - H + (X(J)-XM)**2/VAR 
ACVR(J) =* ABS(Y(J)-SMO(J))/(1.0-W(J)*H) 

70 CONTINUE 

J = 1 

80 JO = J 

SY = SMO(J)*W(J) 

FBW-W(J) 

IF(J.GE.N)GO TO 100 
90 IF(X(J + 1 ).GT.X(J))GO TO 100 
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J-J+1 

SY = SY + W(J)*SMO(J) 

FBW — FB W + W(J) 

IF(J.LT.N)GO TO 90 
100 IF(J.LE.JO)GO TO 120 
SY — SY/FBW 
DO 110 l-JO.J 
SMO(l)»SY 
110 CONTINUE 
120 J-J+1 

IF(J.LE.N)GO TO 80 

RETURN 

END 

.* * A** * * ** ******** ************************************ ************** 



5 



1 



2 

3 



4 



;******************************* A*** ******************* ********* 



SUBROUTINE SORTER(X,W,Y,N) 
REALM X(N),W(N),Y(N),D(5000) 
INTEGER N,KEY(5000) 

DOS 1-1, N 
KEY(I)-I 

CALL SHSORT(X,KEY,N) 
DO 1 1-1, N 
D(l) = W(l) 

DO 2 1-1, N 



J-KEY(I) 
W(I) = D(J) 

DO 3 1-1, N 
D(l)-Y(l) 

DO 4 I — 1 , N 
J-KEY(I) 
Y(I)-D(J) 
RETURN 
END 



******************************************************************* 
BLOCK DATA 



THIS SETS THE COMPILE TIME (DEFAULT) VALUES FOR VARIOUS 
INTERNAL PARAMETERS: 

BIG : A LARGE REPRESENTATIVE FLOATING POINT NUMBER 
SMALL : A SMALL NUMBER. SHOULD BE SET SO THAT (SML)**(10.0) 

DOES NOT CAUSE FLOATING POINT UNDERFLOW 
EPS : USED TO NUMERICALLY STABILIZE SLOPE CALCULATIONS FOR 
RUNNING LINEAR FITS 

THESE PARAMETER VALUES CAN BE CHANGED BY DECLARING THE RELEVANT 
LABELED COMMON IN THE MAIN PROGRAM AND RESETTING THEM WITH 
EXECUTABLE STATEMENTS. 



COMMON /CONSTS/ BIG,SML,EPS 
DATA BIG,SML,EPS /1 .0E20.1 .OE-7,1 .OE-3/ 

END 

3. SUPSMO YSAPLWS 

The following two APL functions are used in conjuction with the two 
files listed above. They were developed by the author of this thesis, 
and are the main APL functions within the APL workspace SUPSMO. 
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The first APL function links the user of the smoothing program with 
GRAFSTAT and gives a user familiar with GRAFSTAT the opportunity 
to proceed into GRAFSTAT where a greater variety of graphic functions 
are available. 



FILE TEAT YOU WANTED 
SO THAT YOU MAY BE 



SMOOTHED HAS 
ABLE TO PLOT 



' THE SMOOTHED AND UNSMOOTHED DATA . ' 

' THE UNSMOOTHED DATA IS IN THE VARIABLE WITH THE SAME 



V ST RT;C 

[1] DUM<-CMS ' CLRSCRN ' 

[2] ~ r THE ENTIRE DATA 
BEEN TRANSFERRED" 1- 

[3] 1 TO THIS WORKSPACE 
BOTH' 

[4] 

[5] 

NAME AS' 

‘6] 'THE DATA FILE THAT YOU HAVE YOUR INPUT DATA IN.' 
[7] 1 1 

”8] • • 

9] 'DO YOU WISH TO GO INTO APL OR CONTINUE?' 

10] 'ENTER 0 FOR APL OR 1 FOR CONTINUE' 

11] C-e-Q 

12 ->1 3 +Cx 8 

“13] 'YOU WILL BE SENT TO APL AFTER YOU HAVE READ 
IMPORTANT TEXT. 

[14] 



THIS 



• ***AFTER YOU HAVE FINISHED WORKING IN APL AND WISH TO 
PLOT THE DATA ' 

[15] 

] 16] 

[177 



[18 

[19 

[20 

[21 

V 



'ENTER pL0TER*******N0TICE THAT PLOTER HAS ONLY ONE T' 

i t 
i t 

'NOW ENTER 0 AGAIN ' 

C-e-0 

->c 

PLOTER 



The next APL function creates the APL variables to be used in the 
GRAFSTAT 'PLOT 1 screen. This plotting option is made available to the 
user through the above APL function. The user can use this APL 
function to do the plotting or use the GRAFSTAT graphics functions. 
A user need not fully understand how to use the GRAFSTAT plot 
screen in order to use this function. Several examples are shown with 
each requested entry so that the user can see what the entry should 
look like. 



V 

PLOTER ; ; CO ; DUM;P; SY ; TI ; TL ; TP ;XL',X0 ; XS ; XT;XY ; XV ; Y L ; YO Y S ; YT ; YV ; Y Y 

[1] DUM+CMS~ T CLRSCRNT “ "" __ 

[2] T YVU~HIVE ACTIVATED THE PLOTTING FUNCTION' 

[3] 'IT IS ASSUMED THAT THE USER IS FAMILIAR WITH THE 
GRAFSTAT PLOT FUNC . ' 

[4] 'AND THE AXIS CONTROL FUNCTION' 

[5] 'IF YOU RECEIVE (MAKE ) AN ERROR MESSAGE DO THE 
FOLLOWING ' 

[6] '1. ENSURE THAT VM READ IS DISPLAYED IN LOWER RIGHT 
CORNER OF SCREEN' 

[7] '2. PRESS THE ENTER KEY' 

[8] '3. ENTER PAGE ' 

[9] 'TO UNDERSCORE A TETTER HOLD THE APL/ ALT KEY DOWN AND 
PRESS THE LETTER ' 

[10] 'THE PLOTTING FUNCTION WILL RESTART AT THE BEGINNING' 
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[ 11 ] 

POINT 
[ 12 ] 

[13] 

:i4i 

[15] 

[16] 

[17] 

[18] 

[19] LB -.'ENTER X VARIABLE (S ) 

ENTERING MORE THAN ONE VARIABLE' 

[20] 'SEPARATE VARIABLES WITH SEMICOLON 

[21] 'E.G. "X" OR ' 'X1;X2' ' ' 

[22] X7<-H 

[23] V UM +CMS ' CLRSCRN ' 

[24] ~'~ENTETT Y VARIABLE (S ) ( ENCLOSED IN 

OF SAME LENGTH AS X)' 

[2 5] 'IF ENTERING MORE THAN ONE VARIABLE, 

SEMICOLON' 



'THE PLOTTING FUNCTION CAN BE EXITED AT ANY INPUT 
BY ENTERING ' 

t t 

'AT ANYTIME THAT YOU EXIT THE PLOTTING FUNCTION' 

'YOU WILL BE IN THE GRAF ST AT WORKSPACE' 

'IF YOU WISH TO RETURN TO CMS ENTER' 

' )0FF HOLD' 

t i 

r i 

LB:' ENTER 



( ENCLOSED IN QUOTES), 
AND USE QUOTES' 



IF 



QUOTES AND MUST BE 
SEPARATE WITH 



[26] 'AND REMEMBER TO USE QUOTES ENCLOSING ENTIRE STRING' 
"27] 'E.G. ' ' Y' ' OR ' ' Y1-, Y2 ' ' ' 

28] YV+® 

29] ~ r ENTER A VECTOR INDICATING TYPE(S) 

ONLY ; INLINE ONLY' 

[30] 'E.G. 0 OR 1 OR 0 1 OR 0 0 OR 1 0 OR 1 1 

[31] 

[32] ( x /TP ) > 0 ) /£ 1 

[3 3] 1 ENTETT TYPE OF SYMBOL CORRESPONDING TO 

ONLY PLOT (I N QUOTES ) ' 

[34] 'E.G. " OR YOU CAN USE .* + x» 

[3 5] SY<-H 

[36] +T(+/TP)=0 )/LP 

[37] LI: 'ENTER A VECTOR INDICATING 
LINE-, 3 =DASH LINE' 



OF PLOT-, 0 =SYM 



EACH SYMBOLS 



[38] 'E.G. 1 OR 3 

TYPES IN GRAF ST AT' 



OR 1 3 OR ANY 



TYPE (S ) OF LINES: 
OTHER COMBINATION 



l=SOLID 
OR LINE 



[39 

[40 

41 

[42] 

[43 

44 

FOR 



TL+® 

LPITL<-1 
*(TpTP)>l )/L2 
SY<- ' 7 T 

L^YDUM+CMS ' CLRSCRN ' 

' ENTER~~S'CALE OF X-AXIS 
PREVIOUS SCALE' 



45] 'E.G. "LIN" 

[46] XS-s-0 

[47] ^~ENTER 
FOR PREVIOUS 



[48] 

[49" 

[50] 

[51" 

[52 

[53 

[54] 

[55' 



SCALE 

SCALE' 



'LIN' 



OR 

OF 

OR 



(IN QUOTES) OR P 
' 'LIN XMIN XMAX' ' OR "P' 



Y-AXIS (IN QUOTES) OR 
' ' LIN YMIN YMAX ' ' OR 



P 

i i p i 



(IN QUOTES) 



(IN QUOTES) 



'E.G. 

YS+U 

T ENTER THE PLOT HEADER (IN QUOTES) OR EMPTY QUOTES' 
'E.G. ' 'TITLE' 'OR " " ' 

TI+® 

DUM+CMS 'CLRSCRN' 

T ENTER~ X-AXIS LABEL (IN QUOTES) OR ' 

'A PAIR OF EMPTY QUOTES FOR NO LABEL 
CONTROL ' 

[56] 'E.G. ' 1 LABEL ' 'OR " " ' 

”57] X2>H 

58] T ENTER Y-AXIS LABEL (IN QUOTES) OR' 

[59] 'A PAIR OF EMPTY QUOTES FOR NO LABEL 
CONTROL ' 

OR " " ' 



OR TO USE AXIS 



OR TO USE AXIS 



[60 

[61] 

[62' 

[63 

[64] 

[65] 
[ 66 ‘ 
[67] 
" 68 ] 



' ' LABLE ' ' 



TO 

NO 



'E.G. 

YL<-& 

^VO YOU WANT 
'ENTER 0 FOR 
CO+El 

VUM+CMS ' CLRSCRN ' 
^LS+tixTCC^O ) 
L3:P+ 1 I 1 
<-0100 



RUN THIS 
OR 1 FOR 



PAGE? 

YES' 
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EXIT 
OR 1 



' PLEASE WAIT RUNNING PAGE 
RUN PAGE SAM 
T Z?0 YOU WANT 
' ENTER 0 FOR 

C0+® 

^(C0 = 1 )/LE 
'DO YOU WANT 
'ENTER 0 FOR 

C0+® 

DUM+CMS ' CLRSCRN ' 

+ 100=11 /LB 
'TEE ONLY THING LEFT 
Lb: 'WITH THE PARTIAL 
CONSTRUCTING ' 

[82] ' ENTER 



[69 

[70 

71 

72 

73 
[74 

75 

76 
77] 
78" 

79 

80 
81 



TO 

NO 



TO 

NO 



THIS FUNCTION? 
FOR YES' 



RESTART THIS FUNCTION?' 
OR 1 FOR YES' 



TO DO IS THE AXIS CONTROL' 

PLOT THAT YOU HAVE JUST FINISHED 



A 3 ELEMENT 



VECTOR 

LINES 



FOR 

AND 



PARTIAL 

SYMBOLS 



PLOT' 

ARE (NOT) 



SHOWN 



[83] '1ST ELEMENT , 1(0): 

ON SCREEN' 

[84] '2ND ELEMENT, 1(0): HEADER AND AXES ARE (NOT) SHOWN ON 
SCREEN ' 

[8 5] ' 3RD ELEMENT, 

(NOT) SHOWN ' 



1(0): AXES, GRIDS, AND GRID LINES ARE 



[ 86 ] 

AND 

[87] 

[ 88 ] 

[89] 
Y = 0' 

[90] 
X=0 ' 

[91] 



'E.G. 1 1 0 WILL SHOW EVERYTHING ON GRAPH EXCEPT AXES 
GRID LINES' 

T ENTER A 4 ELEMENT VECTOR 

'1ST ELEMENT, X-AXIS: 0 = BOTTOM, 2 = TOP , OR 20 = AT 



FOR AXES 
BOTTOM , 



AND 
2 = 



GRID 

TOP, 



CONTROL ' 
OR 20 = 



'2ND ELEMENT, Y-AXIS: 1 = LEFT, 3 h RIGHT, OR 21 = AT 



3RD 



1 =DOTTED , OR 

[92] 'UTH 
1 = DOTTED, OR 

[93] 'E.G. 



SOLID GRID LINES 

[94] 

[95] L8:' PLEASE WAIT 

[96] RUN PAGE SAM 



ELEMENT, VERTICAL GRID LINES: 0 =NO GRID, 

2=SOLID ' 

ELEMENT, HORIZON. GRID LINES: 0 =NO GRID, 

2=SOLID' 

2122 WILL DISPLAY AXIS AT TOP AND LEFT AND 



RUNNING PAGE ' 



[97] LA : DUM<-CMS 'CLRSCRN' 



[98 
[99 
[ 100 ] 
[ 101 ] 
[ 102 " 
[103] 
[104" 
[105 
[106] 
[107" 
[108 
[109] 
[ 110 " 
SYMBOLS 
111 
112 ] 
113] 
114 
[115] 
[116" 
[117 
[118 
[119 
[120 
[ 121 ] 
[ 122 " 
"123 

124] 

125] 
126" 



OR 1’ 
, X=5, 



AND X=ll ' 



' ENTER~X- AXIS TIC MARKS LOCATION VECTOR' 

'OR ENTER 0 FOR STANDARD TIC MARKS' 

'OR ENTER 1 FOR NO TIC MARKS' 

'E.G. 1 5 11 OR A VECTOR NAME OR 0 
'1 5 11 WILL SHOW TIC MARKS AT X=1 
X2VH 

T ENTER X-AXIS SYMBOLS (IN QUOTES ) ' 

'OR ENTER 0 WITHOUT 
'OR ENTER 1 WITHOUT 
'E.G. ' '1970 ; 1971 1 ' 

XY-«-H 

T ENTER X-AXIS SYMBOLS LOCATIONS VECTOR' 

'OR ENTER 0 FOR SYMBOLS AT DEFAULT LOCATIONS 



UOTES FOR STANDARD SYMBOLS' 
UOTES FOR NO SYMBOLS' 

R A VECTOR NAME OR 0 OR 1' 



VECTOR NAME 
1970 AT X=6 



OR 0 ’ 

AND 1971 



i 

'E.G. 6 18 OR A 
'6 18 WILL SHOW 
XO+& 

DUM+CMS 'CLRSCRN' 

T ENTER~Y- axis tic marks LOCATION VECTOR ' 
'OR ENTER 0 FOR STANDARD TIC MARKS' 

'OR ENTER 1 FOR NO TIC MARKS' 

'E.G. 101 OR A VECTOR NAME OR 0 OR 1' 

’ ~1 0 1 WILL SHOW TIC MARKS AT Y=~ 1, Y=0 
Y2V0 

t enter y-axis symbols (in quotes)' 



OR NO 



AT X=1 8 ' 



AND Y=1 ' 



'OR ENTER 0 WITHOUT 
'OR ENTER 1 WITHOUT 
'E.G. ' 'LO MID HI' ' 
YY^Q 

t enter y-axis symbols 



UOTES FOR STANDARD SYMBOLS' 
UOTES FOR NO SYMBOLS' 

R VECTOR NAME OR 0 OR 1' 

LOCATIONS VECTOR' 
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[127] 'OR ENTER 0 FOR SYMBOLS AT DEFAULT LOCATIONS OR NO 
SYMBOLS ' 

[128] 'I.E. 



129] 

[130] 

[131] 

[132] 

[133] 

[134] 

[1357 
[136! 

[137] 

[138] 

[139] 

[140] 

[141] 

[142] 

[1437 
'144' 

WORK' 

[145] 

CMS' 

[146] 

[147] 

AFTER EXIT' 



101 OR VECTOR NAME OR 0' 

,_ 1 0 1 WILL SHOW LO AT Y=~l , MID AT YsO, HI AT Y= 1» 

r THESE AXIS CONTROL ENTRIES WILL NOW BE RUN' 

RUN PAGE AX 

T D0 YOU WANT TO RERUN THE PLOT INPUTS YOU ENTERED' 
'BEFORE RUNNING THIS AXIS CONTROL FUNCTION?' 

'ENTER 0 FOR NO OR 1 FOR YES' 

C0+® 

+(C0=1 )/L 6 

'DO YOU WANT TO DO ANOTHER AXIS CONTROL PAGE?' 

'ENTER 0 FOR NO OR 1 FOR YES' 

CO*-® 

+ (C0 = 1 )/L 8 

'DO YOU WANT TO RESTART THE FUNCTION?' 

LEl'IF YOU DO NOT YOU WILL EXIT THIS FUNCTION' 

'IF YOU EXIT THIS FUNCTION AND WANT TO RETAIN THIS 

'USE THE KEEP FUNCTION AND THEN YOU CAN RETURN TO 

'BY ENTERING )OFF HOLD ' 

'IF YOU WANT TO RETURN TO CMS, SIMPLY ENTER )OFF HOLD 



148] 

149] 

[150] 

[151] 
V 



'ENTER 0 FOR EXIT OR 1 FOR RESTART ' 
C0+® 

5(C0=1 )/LB 
\ 0 ~ 
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APPENDIX B 



SPLIT LINEAR FIT PROGRAM 



1. SE XL1N EXEC 



The following file is the exec file, SPTLIN EXEC, which activates 
and runs the Split Linear Fit smoothing program. Chapter VI contains 
instructions on how to use this smoothing program. 



VLNKMLIB VALTLIB VFORTLIB IMSLSP NONIMSL 



&TRACE 
SET BLIP * 

GLOBAL TXT LIB 
CMSLIB 
CLRSCRN 

&.TYPE YOU HAVE INITIATED AN ALGORITHM 
&TYPE TO SMOOTH A SET OF DATA USING 
&TYPE ' SMOOTHING WITH SPLIT LINEAR FITS' 

&TYPE DEVELOPED BY MCDONALD AND OWEN OF 
&TYPE STANFORD UNIVERSITY DEPT. OF STATISTICS 
- STRT &TYPE ******************************************* 
&TYPE IN ORDER TO USE THIS ALGORITHM USE A 2M MACHINE ** 
S (TYPE ************************************************* 
&TYPE 
&TYPE 

&TYPE DO YOU WISH TO CONTINUE? 

&TYPE ENTER Y FOR YES OR ANY OTHER KEY TO EXIT : 

&READ VAR &CONT 
CLRSCRN 

&JF &CONT NE Y &GOTO -EXIT 

&TYPE IN ORDER TO USE THIS ALGORITHM YOU MUST 
&TYPE HAVE ON HAND THE FOLLOWING : 

&.TYPE 

FILENAME OF DATA FILE ( FILETYPE DATA) 

WITH DATA TO BE SMOOTHED. 



&TYPE 1 
&TYPE 
&.TYPE 
&TYPE 2 
&TYPE 
&.TYPE 
&TYPE 
&.TYPE 
&.TYPE 3 
&TYPE 
&.TYPE 
&TYPE 
&TYPE 
&TYPE 
&TYPE 4 
&TYPE 
&TYPE 
STTYPE 
&TYPE 
&TYPE 5 
&.TYPE 
&TYPE 

&TYPE DO YOU WISH TO CONTINUE? 

&TYPE ENTER Y FOR YES OR ANY OTHER KEY TO EXIT: 
&READ VAR &CONT 
CLRSCRN 

&IF &CONT NE Y &GOTO -EXIT 

& TYPE CAN YOU ACCESS 2M OF STORAGE ON THIS DISK? 



'IF DATA POINTS ARE NOT IN CHRONOLOGICAL ORDER, 
YOU NEED TO HAVE A FILE ( FILETYPE ORDER ) 

WITH INDICES CORRESPONDING TO DATA POINTS 
INDICATING THE ORDER OF THE DATA POINTS. 

FILENAME OF DATA FILE WHERE SMOOTHED OUTPUT 
WILL BE WRITTEN OR IF YOU WANT 
TO WRITE OUTPUT INTO APL HAVE ON HAND 
THE APL VARIABLE AND WORKSPACE NAMES 
THAT WILL STORE THE OUTPUT. 

THE NUMBER OF WINDOW SIZES 
AND THE VALUE OF THE WINDOW SIZES 
THAT YOU WANT TO ATTEMPT 
ON THE SMOOTHING OF THE DATA. 

THE MINIMUM WINDOW SIZE THAT 
CAN BE ATTEMPTED BY THE ALGORITHM. 
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&TYPE ENTER Y FOR YES OR N FOR NO: 

SREAD VAR SCONT 

&IF SCONT EQ N SGOTO - STRT 

StIF SCONT NE Y &COTO -EXIT 

STYPE ENTER FILENAME OF FILE WHICH 

SHY PE CONTAINS THE DATA TO BE SMOOTHED: 

SREAD ARGS 

SIF SN = 0 SGOTO -TELL 

&IF SN > 1 SGOTO -TELL 

STATE SI DATA A 1 

SIF SRC NE 0 SGOTO -ERROR 

CLRSCRN 

-ORDR CLRSCRN 

STYPE ARE DATA POINTS TO BE SMOOTHED 
7 type IN CHRONOLOGICAL ORDER? 

STYPE ENTER Y FOR YES OR N FOR NO: 

SREAD VAR SCONT 
SIF SCONT EQ Y SGOTO -GO 
STYPE THE DATA POINTS ARE NOT 
STYPE IN CHRONOLOGICAL ORDER? 

STYPE ENTER FILENAME OF FILE (FILE TYPE ORDER ) 

STYPE THAT CONTAINS ORDER INDICES 

SREAD VAR SORD 

STATE SORD ORDER A 1 

SIF SRC NE 0 SGOTO -ERROR 

_ qq CLRSCRN 

STYPE THE DATA YOU WANT TO SMOOTH 
STYPE IS IN <51 DATA 

STYPE WHERE DO YOU WANT TO WRITE THE SMOOTHED OUTPUT? 

STYPE CMS OR APL 

STYPE OR YOU CAN PLOT THE SMOOTHED OUTPUT 

STYPE SINCE YOU ARE LOGGED ON A 

STYPE TERMINAL THAT CAN ACCESS GRAF ST AT , 

STYPE BUT THE OUTPUT MUST BE STORED 
STYPE IN AN APL VARIABLE. 

STYPE ENTER APL OR CMS 

SREAD VARS SFLA 

SIF SPLA EQ APL SGOTO -API 

STYPE 

CLRSCRN 

STYPE THE SMOOTHED OUTPUT WILL BE WRITTEN 
STYPE INTO A CMS FILE (FILE TYPE DATA ) 

STYPE ENTER ONLY THE FILENAME. 

SREAD VAR SFN 

STYPE THE SMOOTHED OUTPUT WILL BE WRITTEN 
STYPE INTO THE FILE SFN DATA 
SGOTO -COM 
-API STYPE 

STYPE ENTER THE NAME OF THE 

STYPE APL VARIABLE TO HOLD THE OUTPUT 

SREAD VAR SA 

STYPE DO YOU WANT TO PLOT THE OUTPUT? 

STYPE ENTER Y FOR YES OR N FOR NO: 

SREAD VAR SGRF 

SIF SGRF EQ Y SGOTO -PLOT 

STYPE ENTER THE NAME OF THE APL WORKSPACE 
STYPE THAT WILL CONTAIN SA 
SREAD VAR SWKS 

STYPE IS SWKS AN EXISTING WORKSPACE OR A NEW WORKSPACE? 
STYPE ENTER 0 FOR EXISTING OR N FOR NEW: 

SREAD VAR SAGE 
-PLOT SFN = TE 
-COM STYPE 

STYPE PLEASE READ THE FOLLOWING INSTRUCTIONS VERY CAREFULLY 
STYPE ARE YOU READY TO START THE SMOOTHING PROGRAM? 

STYPE ENTER Y FOR YES ANY OTHER KEY TO EXIT 

SREAD VARS SO 

CLRSCRN 

SIF SO NE Y SGOTO -EXIT 

STYPE PLEASE WAIT THE SMOOTHING PROGRAM IS BEING COMPILED 
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FI 04 CLEAR 
FI 05 CLEAR 
FI 06 CLEAR 
FI 07 CLEAR 
FI 08 CLEAR 
FI 09 CLEAR 

FORTVS SPTLIN (LVL ( 77 ) SDUMP 
FIL 04 DISK &ORD ORDER 
FIL 07 DISK &1 DATA 

FIL 08 DISK &FN DATA ( RECFM FBA LRECL 80 BLKSIZE 800 ) 
CLRSCRN 

& TYPE PLEASE WAIT SMOOTHING PROGRAM IS BEING LOADED 

LOAD SPTLIN (.START 

CLRSCRN 

ERASE SPTLIN LISTING 

ERASE SPTLIN TEXT 

ERASE LOAD MAP 

&IF &PLA EQ CMS &GOTO -EX 

&IF &GRF EQ N &GOTO -NGRF 

CP TERMINAL APL ON 

&STACK )LOAD SPTLIN 

&STACK <£4 +CMSREAD 

& STACK & FN 

&STACK DATA 

&STACK N 

& STACK &A + &A 

&STACK <£1 +C MS RE AD 

&STACK &1 

&.STACK DATA 

&STACK N 

&STACK <£1 <- ,&1 

&STACK )SAVE 

ScTYPE ****PLEASE WAIT, LINKING TO GRAFSTAT************** 
&STACK )LOAD GRAFSTAT 
&STACK DUM +CMS » CLRSCRN ' 

&STACK 1FC0P1~SPTLIN 
&STACK ST RT 
EXEC APLGS~ 

&.GOTO -DRP 

-NGRF CP TERMINAL APL ON 

& STACK )LOAD SPTLIN 

&STACK &A +CMSREAD 

&STACK & FN 

&STACK DATA 

&STACK N 

& STACK &71 

&STACK )SAVE 

&STACK ) CLEAR 

&IF &AGE EQ 0 &STACK )LOAD &WKS 

&IF &AGE EQ N &STACK )WSID &NKS 

&STACK )PCOPY SPTLIN 

&STACK }SAVE 

&STACK )OFF HOLD 

EXEC APL 

- DRP ERASE &FN DATA * 

CP TERMINAL APL ON 
& STACK }LOAD SPTLIN 
&STACK ) ERASE &A 
&STACK )ERASE <£1 
&STACK )SAVE 
&STACK )OFF HOLD 
EXEC APL 

-EX &TYPE YOU HAVE FINISHED. 

&EXIT 1000 

- TELL &.TYPE YOU HAVE ENTERED TOO MANY OR 
&TYPE NOT ENOUGH ENTRIES ABOUT DATA FILE 
&TYPE YOU NEED TO BEGIN AGAIN 
&TYPE ENTER : SPTLIN 

&EXIT 100 

-ERROR &.TYPE ABOVE ENTERED FILE DATA 
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(Sc TYPE DOES NOT EXIST ON YOUR A- DISK 

&TYPE CHECK YOUR FLIST AND THEN BEGIN AGAIN BY ENTERING 
SHY PE SPTLIN 

& EXIT 101 

- EXIT SHYPE YOU HAVE FORCED AN EXIT ON THIS SMOOTHING EXEC 
&TYPE IF YOU WISH TO BEGIN AGAIN ENTER 
&TYPE SPTLIN 

& EXIT 102 



2. SPTLIN FORTRAN 



The following file is SPTLIN FORTRAN which does the actual 

smoothing of a data set. The subroutines used in this program were 

developed by McDonald and Owen [Ref. 10] as stated in Chapter I. 

These subroutines were originally written in the C computer language. 

The author of this thesis translated the C language subroutines and 

combined them into an interactive FORTRAN program which follows. 

DOUBLE PRECISION WEIGHT,Y(1000),X(1000),W(1000),TRY(1000,8,9) 

DOUBLE PRECISION TSMO(1000),SMOOTH(1000) 

REAL INFIN, CEPS, MISVAL, RESCAL, WTPOW 

INTEGER CMXOBS, CMXTRY, MNWNSZ, BASE, TRYSPN(IO), NOBS, NTRYS, 

*NOTMIS 

COMMON /CONSTS/ IN FI N,CEPS, MISVAL, RESCAL, WTPOW 
WRITE(5,2) 

2 FORMAT(1X, 'ENTER THE NUMBER OF DATA POINTS TO BE SMOOTHED 

*— INTEGER VALUE:',/) 

READ(6,*)NOBS 

13 DO 15 1 = 1, NOBS 

15 W(l) = 1. 

14 WRITE(5,16) 

16 FORMAT(1X,'ARE THE INPUT DATA POINTS IN CHRONOLOGICAL ORDER?' 

*,/, IX, 'ENTER 0 FOR NO OR 1 FOR YES',/) 

READ(6,*) ODR 
IF(0DR.E0.1 )GO TO 17 
READ(4,*)(X(I), 1 = 1, NOBS) 

GO TO 18 

17 DO 19 1 = 1, NOBS 

19 X(l) = FLOAT(I) 

1 8 CALL FRTCMS('CLRSCRN ') 

WRITE(5,5) 

5 FORMAT(1X, 'ENTER THE NUMBER WINDOWS TO BE USED 

* INTEGER VALUE') 

READ(6,*)NTRYS 

CALL FRTCMS('CLRSCRN ') 

WRITE(5,10) 

10 FORMAT(1X,'NEXT ENTER THE WINDOW SIZES IN INCREASING ORDER 
*— — INTEGER VALUES',/) 

DO 8 1 = 1, NTRYS 
WRITE(5,9)I 

9 FORMAT(1X, 'ENTER WINDOW SIZE NUMBER', 14,/) 

READ(6,-)TRYSPN(I) 

CALL FRTCMS('CLRSCRN ') 

8 CONTINUE 

WRITE(5,1 1 ) 

11 FORMAT(1X, 'ENTER VALUE OF THE MINIMUM WINDOW SIZE 
* — INTEGER',/) 

READ(6,*)MNWNSZ 
CALL FRTCMS('CLRSCRN ') 

WRITE(5,20) 

20 FORMAT(1X/*****PLEASE WAIT PROGRAM NOW RUNNING********') 

RESCAL = 1.0 
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WTPOW - 2.0 
READ(7,*)(Y(I),I - 1, NOBS) 

IF(ODR.E0.1)GO TO 21 
CALL SORTER(X l W,Y,N) 

21 CONTINUE 

DO 12 1“ 1, NTRYS 
DO 12 J = 1.NOBS 

TRY(J,I,1 ) “0.0 
TRY(J,I,2) = 0.0 
TRY(J,I,3)“0.0 
TRY(J,I,4) = 0.0 
TRY(J,I,5) = 0.0 
TRY(J,I,6) = 0.0 
TRY(J,I,7) = 0.0 
TRY(J,I,8) = 0.0 
TRY(J,I,9) = 0.0 
12 CONTINUE 

DO 6 I = 1, NTRYS 

CALL RUNLRC(NOBS, X, Y, W. NTRYS, MNWNSZ, TRYSPN.l.TRY) 

6 CONTINUE 

CALL COM PWT( NOBS, NTRYS, TRY) 

CALL COMTRY(NOBS, NTRYS, TSMO, TRY) 

DO 7 1 = 1, NTRYS 

CALL RUNLRC(NOBS, X, TSMO, W, NTRYS, MNWNSZ, TRYSPN.I, TRY) 

7 CONTINUE 

CALL COMPWT(NOBS, NTRYS.TRY) 

CALL COMTRY(NOBS, NTRYS, SMOOTH, TRY) 

WRITE(8,4) (SMOOTH(I).I = 1, NOBS) 

4 FORMAT(2X,5(F12.6,2X)) 

STOP 

END 






SUBROUTINE COMPWT(NOBS, NTRYS.TRY) 

DOUBLE PRECISION LAMBDA, TEMP, MIN, TRY(NOBS, NTRYS, 9) 
REAL INFIN, CEPS, MISVAL.RESCAL, WTPOW 
INTEGER NOBS, NTRYS, NTMS, A 

COMMON /CONSTS/ INFIN, CEPS, MISVAL.RESCAL, WTPOW 
DO 1 J = 1, NOBS 
MIN = INFIN 
LAMBDA = 0. 

NTMS = 0. 

DO 2 I = 1, NTRYS 
TEMP = TRY( J, 1,6) 

A = NOTMIS(TEMP, MISVAL) 

IF(A.E0.1)GO TO 3 
GO TO 4 

3 NTMS = NTMS + 1 
LAMBDA = LAMBDA + TEMP 
IF(TEMP.LT.MIN)MIN =TEMP 

4 TEMP = TRY(J,I,4) 

A = NOTMIS(TEMP, MISVAL) 

IF(A.E0.1 )GO TO 5 
GO TO 6 

5 NTMS = NTMS + 1 

LAMBDA = LAMBDA + TEMP 
IF(TEMP.LT.MIN)MIN “TEMP 

6 TEMP “ TRY(J,I,5) 

A = NOTMIS(TEMP, MISVAL) 

IF(A.E0.1 )GO TO 7 
GO TO 2 

7 NTMS = NTMS + 1 

LAMBDA “ LAMBDA + TEMP 
IF(TEMP.LT.MIN)MIN = TEMP 

2 CONTINUE 

LAMBDA = LAMBDA/NTMS 
LAMBDA = LAMBDA - MIN 

IF(LAMBDA.GT.0.)LAMBDA = 1./(LAMBDA*RESCAL) 
IF(MIN.LE.0.)MIN = CEPS 
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8 

1 



* A ******* 



DO 8 I — 1, NTRYS 

TRY(J,I,9) - WEIGHT(TRY(J,I,6), MIN, LAMBDA, J.l) 
TRY(J,I,7) « WEIGHT(TRY(J,I,4), MIN, LAMBDA, J,l) 
TRY(J,I,8) - WEIGHT(TRY(J,I,5), MIN, LAMBDA, J,l) 
CONTINUE 
CONTINUE 
RETURN 
END 



******* 



3 



4 



2 




*************************************************** 



SUBROUTINE COMTRY(NOBS, NTRYS, SMOOTH, TRY) 

DOUBLE PRECISION RSUM,WSUM,T,SMOOTH(NOBS),TRY(NOBS,NTRYS,9) 
REAL CEPS.MISVAL 
INTEGER NOBS, NTRYS, A 

COMMON /CONSTS/ INFIN,CEPS,MISVAL,RESCAL,WTPOW 
DO 1 J - 1, NOBS 
RSUM - 0. 

WSUM - 0. 

DO 2 I - 1, NTRYS 
T - TRY(J,I,3) 

A = NOTMIS(T,MISVAL) 

IF(A.E0.1)THEN 

RSUM - RSUM + TRY(J,I,9)*TRY(J,I,3) 

WSUM - WSUM + TRY(J,I,9) 

ELSE 

GO TO 3 
ENDIF 

T = TRY(J,I,1) 

A = NOTM IS(T, M ISVAL) 

IFfA.EO.1 )THEN 

RSUM = RSUM + TRY(J,I,7)*TRY(J,I,1) 

WSUM = WSUM + TRY(J,I,7) 

ELSE 

GO TO 4 
ENDIF 

T = TRY(J,I,2) 

A - NOTM IS(T,M ISVAL) 

IF(A.E0.1)THEN 

RSUM = RSUM + TRY(J,I,8)*TRY(J,I,2) 

WSUM = WSUM + TRY(J,I,8) 

ELSE 

GO TO 2 
ENDIF 
CONTINUE 

IF(WSUM.GE.CEPS)THEN 

SMOOTH(J) = RSUM/WSUM 
ELSE 

SMOOTH(J) = MISVAL 
ENDIF 
CONTINUE 
RETURN 
END 



SUBROUTINE RUNLRC(NOBS, X, Y, W, NTRYS, MNWNSZ, TRYSPN, I, TRY) 
DOUBLE PRECISION AT,BT,MEANRQ,YMEAN,Y2MEAN, 

* SLOPE, INTER, X(NOBS),Y(NOBS),W(NOBS) 

DOUBLE PRECISION XSUM,YSUM,X2SUM,Y2SUM,XYSUM, 

* WSUM, XVAR,XMEAN,X2MEAN,XYMEAN, EPS, TRY(NOBS, NTRYS, 9) 

REAL CEPS.MISVAL 

INTEGER NOBS, NTRYS, JL, JR, TRYSPN(NTRYS), MNWNSZ, 

* RFLAG.CFLAG.LFLAG, I, JC 

COMMON /CONSTS/ INFIN,CEPS,MISVAL,RESCAL,WTPOW 
JL - NOBS/4 
JR - 3\JL 
EPS = X(JR) -X(JL) 

2 IF(EPS.LE.O.OAND.JR.LT.NOBS)THEN 

IF(JR.LT.NOBS)JR = JR + 1 
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1 

4 

6 

7 

8 

9 

10 

11 

12 

13 

14 

15 

16 

17 



IF(JLGT.1 )JL - JL- 1 
EPS - X(JR) -X(JL) 

ELSE 

GO TO 1 
ENDIF 
GO TO 2 
CONTINUE 
EPS = EPS*CEPS 
EPS - EPS**2 

IF(TRYSPN(I).LT.MNWNSZ)TRYSPN(I) - MNWNSZ 
XSUM - 0. 

YSUM - 0. 

WSUM = 0. 

X2SUM = 0. 

Y2SUM = 0. 

XYSUM = 0. 

Kl = MNWNSZ - 1 
DO 4 JR — 1, Kl 

CALL UPDATE(1 , X(JR), Y(JR), W(JR), XSUM, YSUM, 
WSUM,X2SUM,Y2SUM, XYSUM) 

TRY(JR,I,1) - MISVAL 
TRY(JR,I,4) = MISVAL 
JL - JR - TRYSPN(I) + 1 
JC = JR - TRYSPN(l)/2 
KT = NOBS - MNWNSZ + 1 
IF(JLGE.1 )GO TO 7 
LFLAG = 0 
GO TO 8 
LFLAG = 1 

IF(JC.GE.1 AND.JC.LE.NOBS)GO TO 9 
CFLAG - 0 
GO TO 10 
CFLAG = 1 

IF(JR.LE.NOBS)GO TO 1 1 
RFLAG = 0 
GO TO 12 
RFLAG = 1 

IF(RFLAG.E0.1)GO TO 13 
GOTO 14 

CALL UPDATE(1 , X(JR), Y(JR), W(JR), XSUM, YSUM, 
WSUM,X2SUM,Y2SUM, XYSUM) 

XMEAN = XSUM/WSUM 
X2MEAN = X2SUM/WSUM 
YMEAN = YSUM/WSUM 
XYMEAN - XYSUM /WSUM 
Y2MEAN - Y2SUM/WSUM 
XV A R - X2MEAN - XMEAN**2 
I F(XVAR. LE. EPS)TH EN 
SLOPE - 0. 

ELSE 

SLOPE - (XYMEAN - XMEAN* YMEAN)/XVAR 

ENDIF 

INTER - YMEAN -SLOPE*XMEAN 
MEANRO = Y2MEAN -(2.*INTER*YMEAN)- 
(2.*SLOPE*XYMEAN) + (INTER**2) + 

(2.*INTER*SLOPE*XMEAN) + (X2MEAN*SLOPE**2) 

IF(LFLAG.E0.1 )GO TO 15 
GOTO 16 
AT = TRY(JL,I,3) 

BT-TRY(JL,I,6) 

CALL EVALFT(INTER, SLOPE, MEANRO, JL,W,X,Y,AT,BT,I,WSUM) 

TRY(JU,3)«AT 

TRY(JUI,6)«BT 

IFfRFLAG.EO.I )GO TO 17 

GOTO 18 

AT -TRY(JR,I,1) 

BT -TRY(JR f l,4) 

CALL EVALFT(INTER,SLOPE,MEANRO,JR,W,X,Y,AT,BT,I,WSUM) 
TRY(JR,I,1 ) = AT 
TRY(JR,I,4) = BT 
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bob bob bob 



18 

19 



20 

21 

22 



23 



IF(CFLAG.E0.1 )GO TO 19 
GO TO 20 
AT“TRY(JC,I,2) 

BT“TRY(JC,I,5) 

CALL EVALFT(INTER, SLOPE, MEANRQ,JC,W,X,Y, AT, BT.I.WSUM) 

TRY(JC,I # 2)-AT 

TRY(JC,l f 5) — BT 

IF(LFU\G.E0.1)G0 TO 21 

GO TO 22 

CALL UPDATE(0,X(JL),Y(JL),W(JL), XSUM, YSUM, 
WSUM,X2SUM,Y2SUM,XYSUM) 

JR - JR + 1 
JL « JL + 1 
JC - JC + 1 
IF(JLLE.KT)GO TO 6 
KL = JL 

DO 23 JL - KL, NOBS 
TRY(JU,3) - MISVAL 
TRY(JLJ,6) - MISVAL 
RETURN 
END 

:*ft*****************ft************************* 



:***** 



1 



************************************************* 

INTEGER FUNCTION NOTMIS(AO, MISVAL) 
DOUBLE PRECISION AO 
REAL MISVAL 

IF(AO.GT.MISVAL)GO TO 1 
NOTMIS - 0 
RETURN 

NOTMIS - 1 
RETURN 
END 

************************************************* 



REAL FUNCTION WE1GHT(R,INTER2,SL0PE2,J,I) 

DOUBLE PRECISION R,TEMP2,INTER2,SLOPE2 
REAL WTPOW, MISVAL 
INTEGER A, J 

COMMON /CONSTS/ I N FIN, CEPS, MISVAURESCAL, WTPOW 
A - NOTMIS(R, MISVAL) 

IF(A.E0.1 )THEN 

TEMP2 - SLOPE2*(R - INTER2) 
IF(TEMP2.LE.0.)THEN 
WEIGHT = 1. 

ELSE 

I F(TEM P2. LT. 1 . )TH EN 

WEIGHT - ((1.-TEMP2)**(INT(WTPOW)))*( 
(1.-TEMP2)**(WTPOW- INT(WTPOW))) 

ELSE 

WEIGHT - 0. 

ENDIF 

ENDIF 

ELSE 

WEIGHT - 0. 

ENDIF 

RETURN 

END 



******************************************************** 



******* 



*********************************************** 

SUBROUTINE UPDATE(OP,A1,B1,C1,XSUM,YSUM,WSUM,X2SUM, 
■Y2SUM.XYSUM) 

DOUBLE PRECISION XSUM.YSUM.WSUM^SUM/r^SUM.XYSUM.Al.BI.CI 
INTEGER OP 

IF(OP.EO.O)THEN 

XSUM - XSUM - C1*A1 
YSUM = YSUM -C1*B1 
WSUM - WSUM - Cl 
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X2SUM - X2SUM - C1*Ar*2 
Y2SUM - Y2SUM - C1*B1**2 
XYSUM - XYSUM - C1*A1*B1 
ELSE 

XSUM - XSUM + crAi 
YSUM - YSUM + C1-B1 
WSUM - WSUM + Cl 
X2SUM - X2SUM + C1*Ar*2 
Y2SUM - Y2SUM + C1*BP*2 
XYSUM - XYSUM + C1*A1*B1 
ENDIF 
RETURN 
END 



;AAAAAAAAAAAAAA**AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA 

S U B RO UTI N E EVALFT(A, B, R2, J I, W.X, Y.TRYS.TRYR, I, WSUM) 
DOUBLE PRECISION TRYS,TRYR,WSUM,A,B,R2,W(JI),X(JI), 
*Y(JI),FIT,RES 
INTEGER Jl,l 

FIT - A + B*X(JI) 

RES - Y(JI) - FIT 
TRYS = FIT 

TRYR - (WSUM*R2 - W(JI)*RES*RES)/(WSUM - W(JI)) 
RETURN 
END 



I******************************************************** 

SUBROUTINE SORTER(X,W,Y,N) 

DOUBLE PRECISION X(N),W(N),Y(N),D(5000) 
INTEGER N,KEY(5000) 

DO 5 I = 1 ,N 

5 KEY(I) = I 

CALL SHSORT(X, KEY, N) 

DO 1 I = 1,N 

1 D(l) - W(l) 

DO 2 1 = 1, N 

J = KEY(I) 

2 W(I) = D(J) 

DO 3 1-1, N 

3 D(l)-Y(l) 

DO 4 I — 1,N 

J = KEY(I) 

4 Y(I)=D(J) 

RETURN 

END 



AAAAAAA ********************** **************************** 



A****: 



AAAAAA AAAAAAAAAAAA AA AA AA AAAA AAAAAAAAAAA AAAAAAAAA AAA 

BLOCK DATA 

COMMON /CONSTS/ INFIN,CEPS,MISVAL,RESCAL,WTPOW 

REAL INFIN,CEPS,MISVAL,RESCAL,WTPOW 

DATA INFIN,CEPS,MISVAL,RESCAL,WTPOW /1 .0E30.1 .0E-10, 



*-1.0E30,1.0,2.0/ 



END 



3. SPTLIN VSAPLWS 



The following two APL functions are used in conjunction with the 
two files listed above. They were developed by the author of this 
thesis in order to plot the smoothed data/results. They are the main 
APL functions within the APL workspace SPTLIN. The first APL 
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function links the user of the smoothing program with GRAFSTAT and 
gives a user familiar with GRAFSTAT the opportunity to proceed into 
GRAFSTAT where a greater variety of graphic functions are available. 



V ST RT ; C 

[1] DUM+CMS 1 CLRSCRN ' 

[2] ~ r THE ENTIRE DATA FILE THAT YOU WANTED SMOOTHED HAS 
BEEN TRANSFEKRETP ~ 

[3] 1 TO THIS WORKSPACE SO THAT YOU MAY BE ABLE TO PLOT 
BOTH' 

[4] 'THE SMOOTHED AND UN SMOOTHED DATA.' 

[5] 'THE UN SMOOTHED DATA IS IN THE VARIABLE WITH THE SAME 
NAME AS' 

'THE DATA FILE THAT YOU HAVE YOUR INPUT DATA IN. ' 

i i 

t i 

'DO YOU WISH TO GO INTO APL OR CONTINUE?' 

'ENTER 0 FOR APL OR 1 FOR CONTINUE' 

C <- 0 

■^13 8 

' YOU WILL BE SENT TO APL AFTER YOU HAVE READ THIS 
IMPORTANT TEXT. ' 

[14] 1 ***AFTER YOU HAVE FINISHED WORKING IN APL AND WISH TO 
PLOT THE DATA' 

[15] '***ENTER PLOTER*******NOTICE THAT PLOTER HAS ONLY ONE 



C6] 

[7] 

C8] 

[9] 

[10] 
[ 11 ] 
[ 12 ] 
[13] 



T 1 

[16] 

[17] 

[18] 

[19] 

[ 20 ] 
[ 21 ] 



t i 

i i 

' NOW ENTER 0 AGAIN' 
C «-0 

PLOTER 



V 



The next APL function creates the APL variables to used in the APL 



'PLOT' screen. This plotting option is made available to the user 
through the above APL function. The user can use this APL function 
to do the plotting or the GRAFSTAT graphics functions. A user need 
not fully understand how to use the GRAFSTAT plot screen in order to 
use this function. Several examples are shown with each of the queries 



so that the user can see what the entry should look like. 



PLOTER ; 



[1_ 

[ 2 ] 

[3] 



;COiDUMiP;SY;TI;TL;TP;XL;XO;XS;XT;XY;XV;YL;YO;YS;YT;Y V 



DUM+CMS T CLRSCRN' 

r YVU~HAVE ACTIVATED THE PLOTTING FUNCTION' 

' IT IS ASSUMED THAT THE USER IS FAMILIAR 
GRAFSTAT PLOT FUNC . ' 

[4] 1 AND THE AXIS CONTROL FUNCTION' 

[5] 'IF YOU RECEIVE (MAKE ) AN ERROR 
FOLLOWING ' 

[6] '1. ENSURE THAT VM READ IS DISPLAYED IN LOWER 
CORNER OF SCREEN' 



WITH THE 



MESSAGE 



DO THE 
RIGHT 



[7 

" 8 ] 

9" 



'2. PRESS THE ENTER KEY' 

'3. ENTER PAGE' 

'TO UNDERSCORE A ~LETTER HOLD THE APL /ALT KEY DOWN AND 
PRESS THE LETTER' 

[10] 'THE PLOTTING FUNCTION WILL RESTART AT THE BEGINNING' 

[11] 1 THE PLOTTING FUNCTION CAN BE EXITED AT ANY INPUT 
POINT BY ENTERING' 

[ 12 ] ' 
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[13] 

'14 

15] 

16" 

[17 

[18] 

"19" 



' AT ANYTIME THAT YOU EXIT THE PLOTTING FUNCTION ' 

' YOU WILL BE IN THE GRAF ST AT WORKSPACE' 

'IF YOU WISH TO RETURN TO CMS ENTER' 

' )OFF HOLD ' 

i ■ 

i i 

LB '.'ENTER X VARIABLE (Si ( ( ENCLOSED IN QUOTES'), 



IF 



ENTERING MORE THAN ONE VARIABLE 

"20] 1 SEPARATE VARIABLES WITH SEMICOLON AND USE QUOTES' 

[21] 'E.G. ' 'X' ' OR "X1',X2" ' 

[22] XF^0 

23] DUM <-CMS ' CLRSCRN ' 

[24] ~''ENTEE~ Y VARIABLE (S ) ( ENCLOSED IN QUOTES AND MUST BE 

OF SAME LENGTH AS X) 

i 

[2 5] 'IF ENTERING MORE THAN ONE VARIABLE , SEPARATE WITH 

SEMICOLON ' 

[26] 1 AND REMEMBER TO USE QUOTES ENCLOSING ENTIRE STRING' 

[27] 'E.G. ' 'Y' ' OR ' 'Y1',Y2' ' ' 

[28] YF^0 

[29] ~ r ENTER A VECTOR INDICATING TYPE(S) OF 
ONLY', INLINE ONLY' 

[30] 'E.G. 0 OR 1 OR 0 1 OR 0 0 OR 1 0 OR 1 1' 

[31] 2T^0 

[32] +T (x/TP)>0 )/Ll 

[3 3] 'ENTETT TYPE OF SYMBOL CORRESPONDING TO 
ONLY PLOT (IN QUOTES) 



PLOT', 0 =SYM 



EACH SYMBOLS 



[34 

[35 

[36 

[37 



OR YOU CAN USE .*+x' 



'E.G. 

SY+0 

^T(+/2 , P) = 0 )/LP 

LI '. ' ENTER A VECTOR INDICATING TYPE(S) OF LINES', 1 = SO LID 
LINE', 3 =DASH LINE' 

[38] 'E.G. 1 OR 3 

TYPES IN GRAF ST AT ' 



OR 1 3 OR ANY OTHER COMBINATION OR LINE 
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TL + 0 



-*(TpPP)>l )/L 2 
SY^' _T 



[40] LP:TL<-1 

: 4 1 : 

[42] 

[43] LIYDUM+CMS 'CLRSCRN' 

[44] ' ENTER 'SCALE OF X-AXIS (IN QUOTES) OR P (IN QUOTES) 
FOR PREVIOUS SCALE' 

[45] 'E.G. "LIN" OR "LIN XMIN XMAX" OR "P" < 

[46] XS^0 

[47] i ENTER SCALE OF Y-AXIS (IN QUOTES) OR P (IN QUOTES) 
FOR PREVIOUS SCALE' 

[48" 

"49 

50] 

51 
52] 

53" 

54 

55 



CONTROL 



'E.G. " LIN " OR " LIN YMIN YMAX " OR "P" ' 

Y5^0 

r ENTER THE PLOT HEADER (IN QUOTES) OR EMPTY QUOTES' 
'E.G. ' ' TITLE ' ' OR ' ' ' ' ' 

PJ-e-0 

DUM+CMS 'CLRSCRN' 

T ENTER~X- axis LABEL (IN QUOTES) OR ' 

'A PAIR OF EMPTY QUOTES FOR NO LABEL OR TO USE AXIS 



[56 

[57] 

[58] 

[59] 



CONTROL 1 



'E.G. ' 'LABEL' 'OR " " ' 

XZ>0 

T ENTER Y-AXIS LABEL (IN QUOTES) OR' 

'A PAIR OF EMPTY QUOTES FOR NO LABEL OR TO USE AXIS 



[60] 

61] 

62] 

63] 

64] 
[65] 

66 ] 

[67] 

" 68 ] 

69] 



■ i i 



OR 



it ti 



'E.G. ' ' LABLE' 

YL+U 

T DO YOU WANT TO RUN THIS PAGE? ' 
'ENTER 0 FOR NO OR 1 FOR YES' 

co+u 

DUM+CMS ' CLRSCRN ' 

■ > 'L^+5xXCO=0 ) 

L3:P<- 1 T 1 
-s-0100 

' PLEASE WAIT RUNNING PAGE' 
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TO 

NO 



TO 

NO 



EXIT 
OR 1 



[70] RUN PAGES AM 

[71] T DO YOU WANT 

[72] 'ENTER 0 FOR 

[73] CO«-0 

[74] + ( CO=l)/LE 

[7 5] 1 DO YOU WANT 

76] 'ENTER 0 FOR 

[77] CO<-0 

[78] DUM<-CMS ' CLRSCRN ' 

[ 79 ] +XVO=11/LB 

[80] 'TEE ONLY THING LEFT 

[81] Lb: 'WITH THE PARTIAL 
CONSTRUCTING ' 

[82] 'ENTER A 3 ELEMENT VECTOR 

[83] 1 1ST ELEMENT, 1(0): LINES 
ON SCREEN' 

[84] 1 2ND ELEMENT, 1(0): HEADER 
SCREEN ' 

[85] '3RD ELEMENT, 1(0): AXES , 

{NOT ) SHOWN' 



THIS FUNCTION? 
FOR YES' 



RESTART THIS FUNCTION? 
OR 1 FOR YES' 



TO DO IS THE AXIS CONTROL' 

PLOT THAT YOU HAVE JUST FINISHED 



FOR 

AND 



PARTIAL 

SYMBOLS 



PLOT' 

ARE {NOT) 



SHOWN 



AND AXES ARE {NOT) SHOWN ON 
GRIDS, AND GRID LINES ARE 



'E.G. 1 1 0 WILL SHOW EVERYTHING ON GRAPH EXCEPT AXES 
GRID LINES' 

P <-0 

T ENTER A 4 ELEMENT VECTOR 
'1ST ELEMENT, X-AXIS : 0 = 



[ 86 ] 

AND 
[87 
[88 

[89] 

7=0 ' 

[90] 

X=0 ' 

[91] 
l=DOTTED, OR 

[92] '4 TH 
1 =DOTTED, OR 

[93] 'E.G. 



FOR AXES 
BOTTOM , 



AND GRID 
2 = TOP , 



CONTROL ' 

OR 20 = AT 



SOLID 
[94 
'95 
96 
[97 
[98 
[99 
[ 100 ] 
[ 101 ] 
[ 102 ] 
'1031 
[104[ 

[105] 

[106] 

[107] 

[108] 

[109] 

[ 110 ] 
SYMBOLS ' 



2ND ELEMENT, Y-AXIS: 1 = LEFT, 3 = RIGHT, OR 21 = AT 

3RD ELEMENT, VERTICAL GRID LINES: 0 =NO GRID, 

2 =SOLID 1 

ELEMENT, HORIZON. GRID LINES: 0 =NO GRID, 

2 =SOLID 1 

2 12 2 WILL DISPLAY AXIS AT TOP AND LEFT AND 



GRID LINES' 

<-0 

L8: 'PLEASE WAIT RUNNING PAGE' 

RUN PAGE SAM 
LA : DUM<-CMS 'CLRSCRN' 

'ENTER'X-AXIS TIC MARKS LOCATION VECTOR' 

'OR ENTER 0 FOR STANDARD TIC MARKS' 

'OR ENTER 1 FOR NO TIC MARKS' 

'E.G. 1 5 11 OR A VECTOR NAME OR 0 
'1 5 11 WILL SHOW TIC MARKS AT X=1 
XT+® 

T ENTER X-AXIS SYMBOLS {IN QUOTES)' 

'OR ENTER 0 WITHOUT 
'OR ENTER 1 WITHOUT 
'E.G. 11 1970 :1971' ' 

XY^0 

CENTER X-AXIS SYMBOLS LOCATIONS VECTOR' 

'OR ENTER 0 FOR SYMBOLS AT DEFAULT LOCATIONS 



OR 1 
, X= 5 



AND X=ll' 



NOTES' FOR STANDARD SYMBOLS' 
UOTES FOR NO SYMBOLS' 

R A VECTOR NAME OR 0 OR 1 ' 



OR NO 



[111 

[ 112 ] 

[113] 

[114 

[115 

[116 

[117 

[118 

[119 

[120 

[ 121 ] 

[ 122 " 

[123] 

[124] 

[125] 
[126" 



6 18 
WILL 



OR A 
SHOW 



VECTOR NAME 
1970 AT X=6 



OR O' 

AND 1971 



'E.G. 

'6 18 
X£H0 

DUM+CMS 'CLRSCRN' 

TENTER' Y-AXIS TIC MARKS LOCATION VECTOR ' 
'OR ENTER 0 FOR STANDARD TIC MARKS' 

'OR ENTER 1 FOR NO TIC MARKS' 

'E.G. 101 OR A VECTOR NAME OR 0 OR 1' 

'101 WILL SHOW TIC MARKS AT Y=~l, Y=0 
Y2V0 

TENTER Y-AXIS SYMBOLS {IN QUOTES)' 



AT X=1 8 ' 



AND Y=1 ' 



'OR ENTER 0 WITHOUT 
'OR ENTER 1 WITHOUT 
'E.G. "LO MID HI ' ' 
YY± 0 

TENTER Y-AXIS SYMBOLS 



UOTES FOR STANDARD SYMBOLS' 
UOTES FOR NO SYMBOLS' 

R VECTOR NAME OR 0 OR 1' 

LOCATIONS VECTOR' 
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[127] 
SYMBOLS ' 



'OR ENTER 0 FOR SYMBOLS AT DEFAULT LOCATIONS OR NO 



[128] 

[129] 

[130] 

[131] 

[132] 

[133] 

[134] 

[135] 

[136] 

[137] 

[138] 

[139] 

[140] 

[141] 

[142] 

[143] 

[144] 
WORK' 

[145] 
CMS' 

[146] 

[147] 
AFTER 

[148] 
’149] 
:i50] 
:i5i] 

V 



"10 1 
1 WILL 



OR VECTOR NAME OR O' 
SHOW LO AT Y=~l, MID 



AT Y= 0, HI AT Y=1 ' 



RERUN THE 
THIS AXIS 



OR 1 FOR YES' 



PLOT INPUTS YOU ENTERED ' 
CONTROL FUNCTION?' 



'I.E. 

'"1 0 
10^0 

T THESE AXIS CONTROL ENTRIES WILL NOW BE RUN' 
RUN PAGE AX 
T D0 YOU WANT TO 
'BEFORE RUNNING 
'ENTER 0 FOR NO 
C0+® 

+(C0=1)/L6 
'DO YOU WANT 
'ENTER 0 FOR 
C0+® 

1 )/L 8 

'DO YOU WANT 
LE : ' IF YOU DO 
'IF YOU EXIT 



TO 

NO 



DO 

OR 



ANOTHER AXIS 
1 FOR YES' 



CONTROL PAGE? 



TO RESTART THE FUNCTION?' 

NOT YOU WILL EXIT THIS FUNCTION' 
THIS FUNCTION AND WANT TO RETAIN 



THIS 



'USE THE KEEP FUNCTION AND THEN YOU CAN RETURN TO 



ENTERING 
YOU WANT 



'BY 

'IF 

EXIT' 

' ENTER 
C0+® 

->(C0=1 ) /LB 
10 " 



)OFF HOLD' 

TO RETURN TO CMS, SIMPLY ENTER )OFF HOLD 



0 FOR EXIT OR 1 FOR RESTART ' 
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APPENDIX C 
APL FUNCTIONS 



1. RANDOM HUMBER GENERATOR 



The following GRAFSTAT [Ref. 12] APL function generates Normal 
random deviates and was used to produce the N(0,1) noise added to 
basic functions used in Chapter IV. 



[1 

[2 

[3 

[4 

[5 

[6 

[7 

[8 

[9 



V Z+N NORRAND P;S;I-,T 
P+2's-P, 1 

Z<-(N)pO 

1+1 

F10 : T <- 2 UNIRAND 0.5 0.5 
T+(2xT)- 1 
S<KT[1] *2 )+T[2] *2 
->F10x i5Sl 

Z[ J]-eFCl] +P[2] x(F[l]x( (~2x®S )*S)*0.5 ) 
-*F lOxi (I+I+l )<N 

V 



The following GRAFSTAT [Ref. 12] APL function generates uniform 
random numbers which are used in the above APL function, 
NORRAND . 

V R+N UNIRAND B 

[1] F-«-(B[l]- B[2] ) + ( </7?100000000 )x2xS[2] )*100000000 

V 



2. EQUAL-WEIGHT MOVING AVERAGE SMOOTHER 

The following GRAFSTAT [Ref. 12] APL function is the Moving 
Average smoother used to generate the associated smooth plots in 
Chapter IV. 

V Y+M MOVAV X 
[1] Y^-(MprM) MAV X 

V 



The following GRAFSTAT [Ref. 12] APL function computes weights 
corresponding to the data values within the neighborhood W and does 
the weighted averaging within W. These values are the smoothed 
values which are transferred to the above APL function. 



Cl] 
[2 ] 

[3] 

[ 4 ] 



V U+W MAV X;D;J',L 
->3- (l = ppV)Av/ 1 2 sppX 
->3 + (£>l )a (L<-pW )<1 tD-<-pX 
->0 ,pGh-' NO CO.' ,U<-' ' 
£>[1]*P[1]+1-L 
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U<-D pJ+O 

U+U+WlJ+J+ll*DpJ"X 
■>•6 y~L> J 
V 



3. COSINEz WEIGHTED MOVING AVERAGE SMOOTHER 



The following APL function computes the cosine weights for the 
cosine-weighted moving average [Ref. 14: p. 394] with window size R, 
i.e. length: 

V W+CW R 

[1] n WEIGHTS FOR A COSINE- WEIGHTED MOVING AVERAGE OF 
LENGTH R 

[2] *Ml- 2o(iR)xo2*E+l)*l?+l 

V 



The following APL function is the Cosine Weighted Moving Average 
algorithm. It is part of the time series APL workspace TSERIES devel- 
oped at the Naval Postgraduate School. 



V S+WW RUNSMOOTH X;L ;W iN \M\LW ; I ;R; IX;IDX 
Cl] RUNNING SMOOTH OF X WITH WEIGHTS W. 

[2] p W MUST BE ODD VECTOR THAT ADDS UP TO 1 

[3] P WW HAS AS 1ST ELEMENT THE ADVANCE STEP 
SMOOTHING WINDOW 



L FOR THE 



C4] 



p RESULT IS 2- ROW MATRIX WITH 
INDICES ( ROW 2) 

L-e-T/1 ,1-HW 
W<r 14- AW 

LW+pW 

IDX+ LO . 5*LW 

+ ((IDXxLW*2 )a ( l = + /AO )/L0 5 
' WEIGHTS W IN RUNSMOOTH NOT 
TERMINATED ' 



SMOOTH VALUES ( ROW 1) AND 



[5] 
'67 
7! 

8: 
97 
10 ] 



PROGR 
11 ] 
12 ] 
13" 
14] 
15' 
16 
17 



ODD OR DONT ADD UP TO 1, 



L05: 

N+pX 
S + 10 

IX-^iO 
R+ l+i LW 
M^N+ 1 -LW 



[18] 

[19] 

[ 20 ] 
[ 21 ] 
[ 22 ] 

[23] 

[24] 

[25] 



T-e-1- L 

L10: + (.M<I+I+L)/L20 
S+S ,+/W*XLI+Rl 
IX*- IX , I +IDX 
+L 10 
L 20 : 

p RETURN SMOOTHED VALUES AND THEIR INDICES IN ROW 2 
S+(2 ,pS)pS, IX 
V 
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APPENDIX D 



DATA SETS 

This appendix contains in tabular form the data sets used in this 
thesis. The first table shows the Daily Sea-Surface Temperature in 
degrees Centigrade collected at Granite Canyon, just south of Point 
Sur, California. This data set used in Figures 1.1 and 1.2 which were 
illustrations of a scatterplot and a smooth curve through the scatter- 
plot. The other three tables contain the Test Data sets used in the 
evaluation of the Supersmoother and the Split Linear Fit smoothers, i.e. 
Test Set One, Test Set Two, and Test Set Three. 
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TABLE 19 



DAILY SEA- SURFACE TEMPERATURE IN DEGREE CENTIGRADE 
AT GRANITE CANYON, CALIFORNIA 



TEMP . 


DATE 


TEMP . DATE 


TEMP . DATE 


TEMP . DATE 


TEMP . DATE 


TEMP . 


DATE 


08.8 


1060 


08.8 


1111 


09.7 


1162 


10.6 


1213 


13.7 


1264 


13.2 


1315 


08.5 


1061 


08.5 


1112 


09.9 


1163 


10.7 


1214 


14.0 


1265 


12.4 


1316 


08.5 


1062 


09.3 


1113 


10.1 


1164 


10.7 


1215 


14.4 


1266 


12.5 


1317 


09.0 


1063 


09.1 


1114 


09.8 


1165 


10.8 


1216 


14.0 


1267 


12.0 


1318 


08.8 


1064 


09.0 


1115 


09.8 


1166 


10.9 


1217 


14.8 


1268 


11.4 


1319 


08.3 


1065 


09.0 


1116 


09.5 


1167 


11.2 


1218 


14.0 


1269 


10.8 


1320 


08.9 


1066 


09.6 


1117 


09.8 


1168 


11.5 


1219 


13.8 


1270 


11.3 


1321 


09.2 


1067 


09.0 


1118 


09.8 


1169 


11.0 


1220 


13.1 


1271 


11.1 


1322 


09.2 


1068 


09.0 


1119 


10.2 


1170 


11.3 


1221 


12.7 


1272 


11.3 


1323 


08.9 


1069 


09.3 


1120 


10.5 


1171 


11.3 


1222 


12.8 


1273 


11.9 


1324 


09.8 


1070 


09.5 


1121 


09.9 


1172 


10.8 


1223 


11.8 


1274 


11.9 


1325 


09.8 


1071 


10.8 


1122 


09.9 


1173 


10.9 


1224 


11.6 


1275 


12.1 


1326 


09.8 


1072 


10.8 


1123 


10.0 


1174 


11.0 


1225 


11.8 


1276 


11.7 


1327 


09.9 


1073 


10.6 


1124 


09.8 


1175 


11.2 


1226 


11.8 


1277 


11.7 


1328 


09.6 


1074 


10.4 


1125 


09.7 


1176 


11.8 


1227 


12.2 


1278 


11.5 


1329 


09.5 


1075 


09.8 


1126 


10.2 


1177 


12.1 


1228 


11.9 


1279 


11.0 


1330 


09.6 


1076 


09.8 


1127 


09.9 


1178 


11.5 


1229 


11.2 


1280 


11.5 


1331 


09.3 


1077 


10.3 


1128 


10.0 


1179 


11.7 


1230 


11.3 


1281 


11.5 


1332 


09.8 


1078 


10.8 


1129 


10.0 


1180 


11.9 


1231 


11.5 


1282 


11.9 


1333 


10.4 


1079 


10.9 


1130 


09.9 


1181 


12.3 


1232 


11.0 


1283 


11.2 


1334 


10.9 


1080 


10.0 


1131 


09.9 


1182 


12.5 


1233 


10.5 


1284 


10.9 


1335 


10.8 


1081 


10.0 


1132 


09.8 


1183 


12.8 


1234 


10.4 


1285 


11.2 


1336 


10.0 


1082 


10.0 


1133 


09.8 


1184 


11.8 


1235 


11.0 


1286 


10.8 


1337 


10.0 


1083 


09.7 


1134 


09.9 


1185 


11.9 


1236 


11.2 


1287 


10.9 


1338 


10.0 


1084 


10.0 


1135 


09.5 


1186 


14.0 


1237 


11.1 


1288 


10.7 


1339 


11.0 


1085 


09.6 


1136 


09.9 


1187 


13.9 


1238 


11.8 


1289 


10.0 


1340 


10.3 


1086 


08.9 


1137 


09.9 


1188 


13.4 


1239 


12.6 


1290 


09.6 


1341 


09.7 


1087 


08.9 


1138 


10.4 


1189 


13.0 


1240 


13.2 


1291 


09.2 


1342 


09.9 


1088 


08.9 


1139 


09.9 


1190 


13.2 


1241 


14.0 


1292 


09.6 


1343 


09.9 


1089 


09.2 


1140 


10.5 


1191 


12.0 


1242 


13.9 


1293 


09.7 


1344 


08.8 


1090 


08.9 


1141 


10.2 


1192 


12.0 


1243 


12.2 


1294 


09.8 


1345 


08.9 


1091 


09.2 


1142 


10.0 


1193 


12.0 


1244 


12.1 


1295 


09.4 


1346 


09.1 


1092 


09.4 


1143 


09.2 


1194 


11.9 


1245 


11.8 


1296 


09.8 


1347 


09.3 


1093 


10.4 


1144 


09.5 


1195 


10.6 


1246 


11.3 


1297 


09.8 


1348 


09.0 


1094 


10.0 


1145 


09.8 


1196 


10.5 


1247 


11.3 


1298 


09.7 


1349 


09.3 


1095 


09.8 


1146 


10.0 


1197 


12.6 


1248 


11.5 


1299 


09.8 


1350 


09.6 


1096 


10.1 


1147 


11.2 


1198 


12.5 


1249 


10.7 


1300 


09.5 


1351 


09.8 


1097 


10.2 


1148 


11.2 


1199 


11.1 


1250 


09.9 


1301 


10.0 


1352 


09.8 


1098 


10.0 


1149 


11.7 


1200 


10.8 


1251 


09.5 


1302 


09.9 


1353 


09.8 


1099 


09.3 


1150 


11.8 


1201 


11.0 


1252 


10.0 


1303 


09.9 


1354 


10.0 


1100 


09.2 


1151 


11.2 


1202 


11.0 


1253 


10.8 


1304 


10.0 


1355 


09.8 


1101 


08.9 


1152 


11.2 


1203 


10.3 


1254 


09.6 


1305 


10.6 


1356 


09.0 


1102 


09.0 


1153 


11.2 


1204 


11.2 


1255 


10.3 


1306 


10.7 


1357 


09.3 


1103 


09.2 


1154 


11.2 


1205 


11.7 


1256 


10.8 


1307 


11.0 


1358 


10.9 


1104 


09.4 


1155 


10.9 


1206 


12.0 


1257 


11.0 


1308 


10.8 


1359 


09.7 


1105 


09.2 


1156 


10.9 


1207 


15.0 


1258 


11.0 


1309 


10.5 


1360 


09.0 


1106 


09.2 


1157 


10.9 


1208 


15.8 


1259 


11.5 


1310 


10.8 


1361 


09.4 


1107 


09.4 


1158 


10.6 


1209 


14.5 


1260 


12.2 


1311 


10.7 


1362 


09.0 


1108 


09.5 


1159 


10.4 


1210 


13.3 


1261 


11.2 


1312 


10.7 


1363 


09.1 


1109 


09.5 


1160 


10.3 


1211 


14.0 


1262 


12.4 


1313 


10.7 


1364 


09.2 


1110 


09.7 


1161 


11.4 


1212 


13.8 


1263 


11.8 


1314 


10.7 


1365 
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TABLE 20 

TEST SET ONE 



X 


Y 


X 


Y 


01 


-0.5021245 


51 


3.5978136 


02 


-2.2706658 


52 


-0.3642989 


03 


1.2849535 


53 


1.0218046 


04 


0.7868391 


54 


3.2826479 


05 


0.2750802 


55 


2.7816672 


06 


0.5916070 


56 


1.9686457 


07 


-0.4165126 


57 


2.6941914 


08 


2.0161424 


58 


1.3421858 


09 


0.4190598 


59 


1.3385338 


10 


2.1970718 


60 


2.7176880 


11 


0.7040685 


61 


3.9561078 


12 


1.3516733 


62 


3.2294325 


13 


-0.9261715 


63 


2.0122998 


14 


-0.1411653 


64 


3.4452995 


15 


1.8459821 


65 


2.3519064 


16 


0.0010230 


66 


1.9137510 


17 


1 .2573502 


67 


2.2349597 


18 


0.3599704 


68 


2.1070889 


19 


0.6244237 


69 


2.5508559 


20 


-0.5493385 


70 


3.3621478 


21 


-0.4304499 


71 


1.7760771 


22 


1.8645709 


72 


3.2315889 


23 


0.8751194 


73 


4.0529999 


24 


0.1610555 


74 


3.1099944 


25 


0.2348275 


75 


3.7031440 


26 


1.9017348 


76 


2.9875884 


27 


1 .0237749 


77 


5.0984962 


28 


1.6334782 


78 


4.0441594 


29 


1.5566808 


79 


1.3458853 


30 


1.9562190 


80 


3.2349733 


31 


1.6404860 


81 


1.4321380 


32 


-0.0613807 


82 


4.3081925 


33 


1.6950412 


83 


3.7146003 


34 


2.4851618 


84 


3.9994056 


35 


2.1286416 


85 


4.2742129 


36 


-0.9374542 


86 


5.1924022 


37 


1.2062176 


87 


3.1599492 


38 


1.1970600 


88 


3.3825862 


39 


1.8779879 


89 


4.1757696 


40 


1.0888278 


90 


4.5778941 


41 


1.6379587 


91 


2.5246523 


42 


3.2865109 


92 


3.1299786 


43 


2.5676486 


93 


3.7598849 


44 


2.0281008 


94 


1.2771573 


45 


0.8765109 


95 


4.9586547 


46 


1.7695006 


96 


2.8137205 


47 


2.0278913 


97 


5.0334870 


48 


1.3629063 


98 


3.1335435 


49 


1.6232943 


99 


4.5948086 


50 


2.4152275 


100 


4.8204097 



Y X Y 



3.8639735 


151 


4.5123774 


4.6424739 


152 


7.0449390 


3.9925571 


153 


6.6461625 


4.7141715 


154 


5.9755457 


3.1026557 


155 


6.3987827 


3.5086986 


156 


7.8304268 


5.6847319 


157 


6.3121917 


3.6997858 


158 


7.2091639 


4.5971487 


159 


5.0608475 


2.8131531 


160 


7.5825731 


4.0385152 


161 


8.2574717 


3.7093077 


162 


5.8956979 


4.2573194 


163 


5.5093262 


5.5364895 


164 


5.5995017 


5.5778150 


165 


7.2911596 


5.8100211 


166 


5.8813818 


4.8393109 


167 


6.5830282 


5.2195209 


168 


5.3130510 


3.7046249 


169 


7.7908127 


4 3492568 


170 


6.0401256 


6.1103783 


171 


7.7141272 


5.7786937 


172 


7.6411270 


5.3587051 


173 


6.7540765 


3.7126557 


174 


7.2609074 


5.3246669 


175 


6.6775327 


5.4300705 


176 


6.6715890 


4.6748617 


177 


8.2278953 


5.4123149 


178 


7.1614303 


7.7259102 


179 


6.0619503 


4.7421844 


180 


8.7667237 


3.6291729 


181 


6.1915765 


2.6104761 


182 


7.7027237 


4.5603033 


183 


6.4755850 


4.6852791 


184 


7.0483700 


4.2283128 


185 


7.7978105 


7.4729300 


186 


8.4897860 


6.4484810 


187 


7.1392044 


6.1902920 


188 


7.8562970 


5.9801460 


189 


7.3386392 


2.7272488 


190 


7.6166495 


7.3258741 


191 


6.4476388 


5.7079245 


192 


7.5483537 


4.8179694 


193 


9.2075244 


3.7067425 


194 


6.9231787 


5.8890863 


195 


6.4990181 


7.9270991 


196 


9.4141318 


6.2451185 


197 


9.0460399 


6.7661055 


198 


8.7064297 


5.7754624 


199 


6.4983612 


7.0307144 


200 


9.5544659 



X 

101 

102 

103 

104 

105 

106 

107 

108 

109 

110 

111 

112 

113 

114 

115 

116 

117 

118 

119 

120 

121 

122 

123 

124 

125 

126 

127 

128 

129 

130 

131 

132 

133 

134 

135 

136 

137 

138 

139 

140 

141 

142 

143 

144 

145 

146 

147 

148 

149 

150 



150 







TABLE 


21 










TEST 


SET 


TWO 






X 


Y 


X Y 


X 


Y 


X 


Y 


01 


0.4570756 


51 1.1056374 


101 


-0.7988832 


151 


-0.5570467 


02 


-1.3538641 


52 -2.9317810 


102 


-0.0285995 


152 


1.9443678 


03 


2.1577621 


53 -1.6202036 


103 


-0.6857874 


153 


1.5128773 


04 


1.6140663 


54 0.5669487 


104 


0.0294491 


154 


0.8079680 


05 


1.0551468 


55 -0.0068339 


105 


-1.5876051 


155 


1.1953248 


06 


1.3229449 


56 -0.8917159 


106 


-1.1863163 


156 


2.5894945 


07 


0,2645429 


57 -0.2370383 


107 


0.9856908 


157 


1.0321866 


08 


2.6453778 


58 -1.6588701 


108 


-1.0026112 


158 


1.8884863 


09 


0.9949566 


59 -1.7312588 


109 


-0.1079922 


159 


-0.3021013 


10 


2.7181328 


60 -0.4197057 


110 


-1.8941798 


160 


2.1757581 


11 


1.1688202 


61 0.7522926 


111 


-0.6705179 


161 


2.8052014 


12 


1.7586682 


62 -0.0395823 


112 


-1.0009952 


162 


0.3963923 


13 


-0.5783523 


63 -1.3206523 


113 


-0,4538849 


163 


-0.0385836 


14 


0.1460898 


64 0.0497107 


114 


0.8246897 


164 


0.0014325 


15 


2.0713177 


65 -1.1049823 


115 


0.8656625 


165 


1.6413922 


16 


0.1631187 


66 -1.6030669 


116 


1.0976953 


166 


0.1783960 


17 


1.3549229 


67 -1.3403846 


117 


0.1269276 


167 


0.8253250 


18 


0.3917761 


68 -1.5253494 


118 


0.5071318 


168 


-0.5008452 


19 


0.5892597 


69 -1.1372168 


119 


-1.0077820 


169 


1.9192740 


20 


-0.6526318 


70 -0.3800745 


120 


-0.3632442 


170 


0.1095231 


21 


-0.6029871 


71 -2.0187875 


121 


1.3976432 


171 


1.7230700 


22 


1.6217221 


72 -0.6143904 


122 


1.0655210 


172 


1.5882570 


23 


0.5609396 


73 0.1574513 


123 


0.6448279 


173 


0.6380705 


24 


-0.2254246 


74 -0.8335629 


124 


-1.0022558 


174 


1.0804791 


25 


-0.2248702 


75 -0.2868485 


125 


0.6083290 


175 


0.4314349 


26 


1.3679551 


76 -1.0472555 


126 


0.7118525 


176 


0.3586158 


27 


0.4151033 


77 1.0203924 


127 


-0.0457512 


177 


1.8468834 


28 


0.9491606 


78 -0.0756075 


128 


0.6887321 


178 


0.7112616 


29 


0.7960204 


79 -2.8139453 


129 


2.9987233 


179 


-0.4584467 


30 


1.1185767 


80 -0.9633214 


130 


0.0107011 


180 


2.1750750 


31 


0.7252823 


81 -2.8030239 


131 


-1.1073564 


181 


-0.4722970 


32 


-1.0546655 


82 0.0377556 


132 


-2.1319047 


182 


0.9657036 


33 


0.6232166 


83 -0.5895273 


133 


-0.1887889 


183 


-0.3354497 


34 


1.3344005 


84 -0.3368381 


134 


-0.0714380 


184 


0.1625056 


35 


0.8986088 


85 -0.0925853 


135 


-0.5369943 


185 


0.8363578 


36 


-2.2470305 


86 0.7965962 


136 


2.6980175 


186 


1 .4520438 


37 


-0.1831108 


87 -1.2633354 


137 


1.6628992 


187 


0.0245292 


38 


-0.2721655 


88 -1.0666680 


138 


1.3929299 


188 


0.6641043 


39 


0.3287841 


89 -0.2979676 


139 


1.1698476 


189 


0.0684045 


40 


-0.5403718 


90 0.0811356 


140 


-2.0971853 


190 


0.2679094 


41 


-0.0711898 


91 -1.9936927 


141 


2.4860636 


191 


-0.9800088 


42 


1.4975242 


92 -1.4085479 


142 


0.8514574 


192 


0.0414590 


43 


0.6989979 


93 -0.7974495 


143 


-0.0564717 


193 


1.6211055 


44 


0.0800240 


94 -3.2976455 


144 


-1.1870254 


194 


-0.7429779 


45 


-1.1506912 


95 0.3676870 


145 


0.9746058 


195 


-1.2470265 


46 


-0.3364633 


96 -1.7921467 


146 


2.9904894 


196 


1.5881130 


47 


-0.1564088 


97 0.4139455 


147 


1.2849344 


197 


1.1400245 


48 


-0.8992433 


98 -1.4984893 


148 


1.7808755 


198 


0.7204593 


49 


-0.7161574 


99 -0.0485762 


149 


0.763691 


199 


-1.5674586 


50 


-0.0009194 


100 0.1667661 


150 


1.9908847 


200 


1.4089659 
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TABLE 22 

TEST SET THREE 



X 


Y 


X 


Y 


01 


0.5178755 


51 


5.4778136 


02 


-1.2306658 


52 


1.3957011 


03 


2.3449535 


53 


2.6618046 


04 


1.8668391 


54 


4.8026479 


05 


1.3750802 


55 


4.1816672 


06 


1.7116070 


56 


3.2486457 


07 


0.7234874 


57 


3.8541914 


08 


3.1761424 


58 


2.3821858 


09 


1.5990598 


59 


2.2585338 


10 


3.3970718 


60 


3.5176880 


11 


1.9240685 


61 


4.6361078 


12 


2.5916733 


62 


3.7894325 


13 


0.3338285 


63 


2.4522998 


14 


1.1388347 


64 


3.7652995 


15 


3.1459821 


65 


2.5519064 


16 


1.3210230 


66 


1.9937510 


17 


2.5973502 


67 


2.1949597 


18 


1.7199704 


68 


1.9470889 


19 


2.0044237 


69 


2.2708559 


20 


0.8506615 


70 


2.9621478 


21 


0.9895501 


71 


1.2560771 


22 


3.3045709 


72 


2.5915889 


23 


2.3351194 


73 


3.2929999 


24 


1.6410555 


74 


2.2299944 


25 


1.7348275 


75 


2.7031440 


26 


3.4217348 


76 


1.8675884 


27 


2.5637749 


77 


3.8584962 


28 


3.1934782 


78 


2.6841594 


29 


3.1366808 


79 


-0.1341147 


30 


3.5562190 


80 


1.6349733 


31 


3.2604860 


81 


-0.2878620 


32 


1.5786193 


82 


2.4681925 


33 


3.3550412 


83 


1.7546003 


34 


4.1651618 


84 


1.9194056 


35 


3.8286416 


85 


2.0742129 


36 


0.7825458 


86 


2.8724022 


37 


2.9462176 


87 


0.7199492 


38 


2.9570600 


88 


0.8225862 


39 


3.6579879 


89 


1.4957696 


40 


2.8888278 


90 


1.7778941 


41 


3.4579587 


91 


-0.3953477 


42 


5.1265109 


92 


0.0899786 


43 


4.4276486 


93 


0.5998849 


44 


3.9081008 


94 


-2.0028427 


45 


2.7765109 


95 


1.5586547 


46 


3.6895006 


96 


-0.7062795 


47 


3.9678913 


97 


1.3934870 


48 


3.3229063 


98 


-0.6264565 


49 


3.6032943 


99 


0.7148086 


50 


4.4152275 


100 


0.8204097 



Y X Y 



-0.1760265 


151 


-1.1356226 


0.5704739 


152 


1.3569390 


-0.1114429 


153 


0.9181625 


0.5781715 


154 


0.2075457 


-1.0653443 


155 


0.5907827 


-0.6913014 


156 


1.9824268 


1.4527319 


157 


0.4241917 


-0.5642142 


158 


1.2811639 


0.3011487 


159 


-0.9071525 


-1.5148469 


160 


1.5745731 


-0.3214848 


161 


2.2094717 


-0.6826923 


162 


-0.1923021 


-0.1666806 


163 


-0.6186738 


1.0804895 


164 


-0.5684983 


1.0898150 


165 


1.0831596 


1.2900211 


166 


-0.3666182 


0.2873109 


167 


0.2950282 


0.6355209 


168 


-1.0149490 


-0.9113751 


169 


1.4228127 


-0.2987432 


170 


-0.3678744 


1.4303783 


171 


1.2661272 


1.0666937 


172 


1.1531270 


0.6147051 


173 


0.2260765 


-1.0633443 


174 


0.6929074 


0.5166669 


175 


0.0695327 


0.5900705 


176 


0.0235890 


-0.1971383 


177 


1.5398953 


0.5083149 


178 


0.4334303 


2.7899102 


179 


-0.7060497 


-0.2258156 


180 


1.9587237 


-1.3708271 


181 


-0.6564235 


-2.4215239 


182 


0.8147237 


-0.5036967 


183 


-0.4524144 


-0.4107209 


184 


0.0803703 


-0.8996872 


185 


0 7898105 


2.3129300 


186 


1.4417860 


1.2564810 


187 


0.0512044 


0.9662920 


188 


0.7282970 


0.7241460 


189 


0.1706392 


-2.5607512 


190 


0.4086495 


2.0058741 


191 


-0.8003612 


0.3559245 


192 


0.2603537 


-0.5660306 


193 


1.8795244 


-1.7092575 


194 


-0.4448213 


0.4410863 


195 


-0.9089819 


2.4470991 


196 


1.9661318 


0.7331185 


197 


1.5580399 


1.2221055 


198 


1.1784297 


0.1994624 


199 


-1.0696388 


1.4227144 


200 


1.9464659 



X 

101 

102 

103 

104 

105 

106 

107 

108 

109 

110 

111 

112 

113 

114 

115 

116 

117 

118 

119 

120 

121 

122 

123 

124 

125 

126 

127 

128 

129 

130 

131 

132 

133 

134 

135 

136 

137 

138 

139 

140 

141 

142 

143 

144 

145 

146 

147 

148 

149 

150 



152 



APPENDIX E 

SAMPLE SESSION USING SUPSMO PROGRAM 



The following is a computer session showing the interaction between 
the smoothing program SUPSMO and a user. The steps necessary to 
run the smoothing program SPTLIN are very similar to those in this 
session. It should be remembered that SPTLIN requires a terminal 
which can access 2M of computer storage memory. 



sups mo 



YOU HAVE INITIATED AN ALGORITHM TO SMOOTH A SET OF DATA 
USING THE 

ALGORITHM " SUPER SMOOTHER" DEVELOPED BY FRIEDMAN AND 
STUETZLE OF 

STANFORD UNIVERSITY DEPT. OF STATISTICS 

IF GRAPHICS WILL NOT BE USED DEFINE STORAGE AS 1024 K 
BY ENTERING ' DEF STOR 1024K' 

FOLLOWED BY ' I CMS ' , 

THEN BY ' SUPSMO ' 

DO YOU WISH TO CONTINUE? 

ENTER Y FOR YES OR ANY OTHER KEY TO EXIT : 



Y 



IN ORDER TO USE THIS ALGORITHM YOU MUST HAVE ON HAND THE 
FOLLOWING : 

1. FILENAME OF DATA FILE ( FILETYPE DATA) WITH DATA TO BE 
SMOOTHED 

2. IF DATA POINTS ARE NOT IN CHRONOLOGICAL ORDER, YOU NEED 
TO 

HAVE A FILE ( FILETYPE ORDER) WITH INDICES CORRESPONDING TO 
DATA 

POINTS INDICATING THE ORDER OF THE DATA POINTS. 

3 . FILENAME OF DATA FILE WHERE SMOOTHED OUTPUT WILL BE 
WRITTEN 

OR IF YOU WANT TO WRITE OUTPUT INTO APL HAVE ON HAND 
THE VARIABLE AND WORKSPACE NAMES THAT WILL STORE THE 
OUTPUT . 

4. IF YOU WANT TO SMOOTH THE DATA USING ONLY ONE WINDOW 
SIZE, 

HAVE ON HAND THE DECIMAL FRACTION OF THE DATA TO BE USED. 

5. IF YOU WANT TO SMOOTH THE DATA USING THREE WINDOW SIZES, 
HAVE ON HAND THE THREE DECIMAL FRACTIONS OF THE DATA TO BE 

USED. 

DO YOU WISH TO CONTINUE? 

ENTER Y FOR YES OR ANY OTHER KEY TO EXIT: 
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Y 



ENTER FILENAME OF FILE WHICH CONTAINS THE DATA TO BE 
SMOOTHED : 



water 



ARE DATA POINTS TO BE SMOOTHED IN CHRONOLOGICAL ORDER? 
ENTER Y FOR YES OR N FOR NO: 



Y 



THE DATA YOU WANT TO SMOOTH IS IN WATER DATA 
WHERE DO YOU WANT TO WRITE THE SMOOTHED OUTPUT? CMS OR APL? 
YOU CAN PLOT THE SMOOTHED OUTPUT IF YOU ARE LOGGED ON 
A TERMINAL THAT CAN ACCESS GRAF ST AT , I.E. HAVE 2 M OF 
STORAGE 

BUT THE OUTPUT MUST BE STORED IN AN APL VARIABLE 
ENTER APL OR CMS: 



apl 



NOT USING THE NAME OF THE FILE WITH THE INPUT DATA, WATER 
ENTER THE NAME OF THE APL VARIABLE THAT WILL STORE THE 
OUTPUT : 



smuflg 53 



DO YOU WANT TO PLOT THE OUTPUT? 
ENTER Y FOR YES OR N FOR NO: 



y 



CAN YOU ACCESS 2 M OF STORAGE ON THIS DISK (TERMINAL)? 
ENTER Y FOR YES OR N FOR NO: 



y 



PLEASE READ THE FOLLOWING INSTRUCTIONS VERY CAREFULLY 
ARE YOU READY TO START THE SUPER SMOOTHING PROGRAM? 
ENTER Y FOR YES OR ANY OTHER KEY TO EXIT: 



y 



*****please wait the smoothing program is being 

COMPILED********** 

VS FORTRAN COMPILER ENTERED. 16 : 11:30 
**MAIN** END OF COMPILATION 1 ****** 

**SUPSMU** END OF COMPILATION 2 ****** 
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**SMOOTH** END OF COMPILATION 3 ****** 

**SORTER** END OF COMPILATION 4 ****** 

**BLKDT #** END OF COMPILATION 5 ****** 
VS FORTRAN COMPILER EXITED. 16:11:36 



*****PL£4SF WAIT SMOOTHING PROGRAM IS BEING 
LOADED ***************** 

EXECUTION BEGINS . . . 

ENTER THE NUMBER OF DATA POINTS TO BE SMOOTHED- -- INTEGER 
VALUE 



671 



ARE THE INPUT DATA POINTS IN CHRONOLOGICAL ORDER ? 
ENTER 0 FOR NO OR 1 FOR YES 



1 



ENTER 1.0 IF YOU DESIRE TO USE ONLY ONE SPAN VALUE 
ENTER 0.0 IF YOU DESIRE TO USE THREE SPAN VALUES 

0.0 



ENTER THE LOWEST SPAN VALUE: 

FRACTION OF 671 I.E. A REAL NUMBER BETWEEN 0.0 AND 1.0 



0 . 00745 



ENTER THE MIDDLE SPAN VALUE : 

FRACTION OF 671 I.E. A REAL NUMBER BETWEEN 0.0 AND 1.0 



0.016393 



ENTER THE HIGHEST SPAN VALUE: 

FRACTION OF 671 I.E. A REAL NUMBER BETWEEN 0.0 AND 1.0 



0.0175 



IF ONE OF THE SPAN VALUES IS SMALL 
I.E. RESULTS IN A SMALL WINDOW SIZE (10 OR LESS) 
YOU MAY WISH TO ADJUST THE ROBUSTNESS 
BY ENTERING A REAL NUMBER GT 0.0 BUT LT 10.0 
OR FOR NO ROBUST ADJUSTMENT ENTER 0.0 
ENTER YOUR CHOICE 



0.0 



*****please wait smoothing program now running ****** 

****PLEASE WAIT, LINKING TO GMFSMr***************** 
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N (193 ) R/0 
M (194) R/0 
F (391) R/0 

VS A P L 4.0 

CLEAR WS 

SAVED 06:26:45 09/01/85 
WSSIZE IS 1188956 

CMS MATRIX IS EOT RECTANGULAR . ROW ONE HAS 5 ELEMENTS 
ROW 135 HAS 1 ELEMENT (S) 

INFORMATION TRANSFER HAS STOPPED AT THIS LINE 

16:12:42 09/04/85 SUPSMO 
SAVED 15:18:53 05/09/85 
WSSIZE IS 1188956 

THIS IS THE 4/01/8 5 RELEASE OF GRAF ST AT . IT RUNS ON THE 

3277 /GA OR 

ON THE 3278/79. CONTROL VECTORS FROM EARLIER RELEASES WILL 
CONTINUE 

TO RUN. IF YOU )COPY RATHER THAN )LOAD THIS WORKSPACE YOU 
MUST 

EXECUTE THE FUNCTIONL TENT BEFORE STARTING . THE NEXT 
RELEASE IS ~ 

SCHEDULED FOR 9/85. 

TO BEGIN , TYPE: START 

FOR MORE INFORMATION , TYPE: DESCRIBE 

NOT COPIED: RCODE GET XBLANKS VC AT 
SAVED 16 : 12 :^2“Ug/04/85 

THEENTIRE DATA FILE THAT YOU WANTED SMOOTHED HAS BEEN 
TRANSFERRED 

TO THIS WORKSPACE SO THAT YOU MAY BE ABLE TO PLOT BOTH 
THE SMOOTHED AND UNSMOOTHED DATA. 

THE UNSMOOTHED DATA IS IN THE VARIABLE WITH THE SAME NAME 
AS 

THE DATA FILE THAT YOU HAVE YOUR INPUT DATA IN. 



DO YOU WISH TO GO INTO APL OR CONTINUE? 
ENTER 0 FOR APL OR 1 FOR CONTINUE 0 : 



1 



YOU HAVE ACTIVATED THE PLOTTING FUNCTION 

IT IS ASSUMED THAT THE USER IS FAMILIAR WITH THE GRAFSTAT 
PLOT FUNC. 

AND THE AXIS CONTROL FUNCTION 

IF YOU RECEIVE (MAKE) AN ERROR MESSAGE DO THE FOLLOWING 

1. ENSURE THAT VM READ IS DISPLAYED IN LOWER RIGHT CORNER 
OF SCREEN 

2. PRESS THE ENTER KEY 

3. ENTER P AG 

TO UNDERSCORE A~ LETTER HOLD THE APL/ ALT KEY DOWN AND PRESS 
THE LETTER 

THE PLOTTING FUNCTION WILL RESTART AT THE BEGINNING 
THE PLOTTING FUNCTION CAN BE EXITED AT ANY INPUT POINT BY 
ENTERING 

AT ANYTIME THAT YOU EXIT THE PLOTTING FUNCTION 
YOU WILL BE IN THE GRAFSTAT WORKSPACE 
IF YOU WISH TO RETURN TO CMS ENTER 
)OFF HOLD 
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ENTER X VARIABLE (S) ( ENCLOSED IN QUOTES ), IF ENTERING MORE 
THAN ONE VARIABLE 

SEPARATE VARIABLES WITH SEMICOLON AND USE QUOTES 
E.G. ' X ' OR ' XI ; X2 ' 0 : 



* ( 670 ) + 59 ' 



ENTER Y VARIABLE (S ) ( ENCLOSED IN QUOTES AND MUST BE OF SAME 
LENGTH AS X) 

IF ENTERING MORE THAN ONE VARIABLE . SEPARATE WITH SEMICOLON 
AND REMEMBER TO USE QUOTES ENCLOSING ENTIRE STRING 
E.G. 1 Y ' OR 1 Y1 ; Y2 1 0 : 



'670 WATER •, SMUFIG53 * 



ENTER A VECTOR INDICATING TYPE(S) OF PLOT ; 0=SYM ONLY; 
INLINE ONLY 

E.G. 0 OR 1 OR 0 1 OR 0 0 OR 1 0 OR 110 : 



0 1 



ENTER TYPE OF SYMBOL CORRESPONDING TO EACH SYMBOLS ONLY 
PLOT UN QUOTES) 

E.G. '.' OR YOU CAN USE .* + 0 : 



i i 



ENTER A VECTOR INDICATING TYPE(S) OF LINES; 1 =SOLID LINE; 
3 =DASH LINE 

E.G. 1 OR 3 OR 1 3 OR ANY OTHER COMBINATION OR LINE TYPES 
IN GRAF ST AT 0 : 



1 



ENTER SCALE OF X-AXIS (IN QUOTES) OR P (IN QUOTES) FOR 
PREVIOUS SCALE 

E.G. 'LIN' OR 'LIN XMIN XMAX' OR 'P' 0 : 



1 LIN . 1 7 35' 



ENTER SCALE OF Y-AXIS (IN QUOTES) OR P (IN QUOTES) FOR 
PREVIOUS SCALE 

E.G. 'LIN' OR 'LIN YMIN YMAX 1 OR 'P' 0 : 



'LIN 8 17' 



ENTER THE PLOT HEADER (IN QUOTES) OR EMPTY QUOTES 
E.G. 'TITLE' OR ' ' 0 : 



'SMOOTHING WITH SUPERSMOOTHER, ALPHA= 0.0- SPAN(S)= 
0.00745, 0.016393, 0.0175' 
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ENTER X-AXIS LABEL (IN QUOTES) OR 

A PAIR OF EMPTY QUOTES FOR NO LABEL OR TO USE AXIS CONTROL 
E.G. 'LABEL' OR ' ' S3 : 



'JULIAN CALENDAR DATE' 



ENTER Y-AXIS LABEL (IN QUOTES) OR 

A PAIR OF EMPTY QUOTES FOR NO LABEL OR TO USE AXIS CONTROL 
E.G. 'LABLE' OR ' ' 0 : 



'TEMPERATURE IN DEGREES CENT.' 



DO YOU WANT TO RUN THIS PAGE? 
ENTER 0 FOR NO OR 1 FOR YES 0 : 



0 



DO YOU WANT TO EXIT THIS FUNCTION ? 
ENTER 0 FOR NO OR 1 FOR YES 



0 : 

0 



DO YOU WANT TO RESTART THIS FUNCTION? 
ENTER 0 FOR NO OR 1 FOR YES 0 : 



0 



THE ONLY THING LEFT TO DO IS THE AXIS CONTROL 
WITH THE PARTIAL PLOT THAT YOU HAVE JUST FINISHED 

CONSTRUCTING 

ENTER A 3 ELEMENT VECTOR FOR PARTIAL PLOT 

1ST ELEMENT, 1(0): LINES AND SYMBOLS ARE (NOT) SHOWN ON 

SCREEN 

2ND' ELEMENT, 1(0): HEADER AND AXES ARE (NOT) SHOWN ON 

SCREEN 

3RD ELEMENT, 1(0): AXES, GRIDS, AND GRID LINES ARE (NOT) 

SHOWN 

E.G. 1 1 0 WILL SHOW EVERYTHING ON GRAPH EXCEPT AXES AND 
GRID LINES 0 : 



110 



ENTER A 4 ELEMENT VECTOR FOR AXES AND GRID CONTROL 
1ST ELEMENT, X-AXIS: 0 = BOTTOM, 2 = TOP , OR 20 s AT Y = 0 

2ND ELEMENT, Y-AXIS: 1 = LEFT, 3 = RIGHT, OR 21 = AT X=0 

3RD ELEMENT, VERTICAL GRID LINES: 0 =NO GRID, l=DOTTED , OR 

2 =SOLID 

4 TH ELEMENT, HORIZON. GRID LINES: 0 =NO GRID, 1 ^DOTTED, OR 

2 =SOLID 

E.G. 2 1 2 2 WILL DISPLAY AXIS AT TOP AND LEFT AND SOLID 
GRID LINES 0 : 



0 10 0 



PLEASE WAIT RUNNING PAGE 
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ENTER X-AXIS TIC MARKS LOCATION VECTOR 

OR ENTER 0 FOR STANDARD TIC MARKS 

OR ENTER 1 FOR NO TIC MARKS 

E.G. 1 5 11 OR A VECTOR NAME OR 0 OR 1 

1 5 11 WILL SHOW TIC MARKS AT X=1 , X=5, AND X=ll 0 : 



0 31 59 90 120 151 181 212 243 273 304 334 365 396 
424 455 485 516 546 577 608 638 669 699 730 



ENTER X-AXIS SYMBOLS (IN QUOTES ) 

OR ENTER 0 WITHOUT QUOTES FOR STANDARD SYMBOLS 
OR ENTER 1 WITHOUT QUOTES FOR NO SYMBOLS 
E.G. '1970; 1971' OR A VECTOR NAME OR 0 OR 1 0 : 



'1080;1172;1264;1355;2080;2172;2264;2355' 



ENTER X-AXIS SYMBOLS LOCATIONS VECTOR 

OR ENTER 0 FOR SYMBOLS AT DEFAULT LOCATIONS OR NO SYMBOLS 
E.G. 6 18 OR A VECTOR NAME OR 0 

6 18 WILL SHOW 1970 AT X = 6 AND 1971 AT Xsl8 0 : 



80 172 264 355 445 536 629 721 



ENTER Y-AXIS TIC MARKS LOCATION VECTOR 
OR ENTER 0 FOR STANDARD TIC MARKS 
OR ENTER 1 FOR NO TIC MARKS 

E.G. 10 1 OR A VECTOR NAME OR 0 OR 1 10 1 WILL SHOW 

TIC MARKS AT Y= 1, Y= 0, AND Y=1 0 : 



0 



ENTER Y-AXIS SYMBOLS (IN QUOTES) 

OR ENTER 0 WITHOUT QUOTES FOR STANDARD SYMBOLS 
OR ENTER 1 WITHOUT QUOTES FOR NO SYMBOLS 
E.G. 'LO MID HI' OR VECTOR NAME OR 0 OR 1 0 : 



0 



ENTER Y-AXIS SYMBOLS LOCATIONS VECTOR 

OR ENTER 0 FOR SYMBOLS AT DEFAULT LOCATIONS OR NO SYMBOLS 
I.E. 101 OR VECTOR NAME OR 0 " 1 0 1 WILL SHOW LO AT Y= 
1, MID AT Y=0 , HI AT Y= 1 0 : 



0 



THESE AXIS CONTROL ENTRIES WILL NOW BE RUN 



DO YOU WANT TO RERUN THE PLOT INPUTS YOU ENTERED 
BEFORE RUNNING THIS AXIS CONTROL FUNCTION? 

ENTER 0 FOR NO OR 1 FOR YES 0 : 



0 
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DO YOU WANT TO RERUN THE PLOT INPUTS YOU ENTERED 
BEFORE RUNNING THIS AXIS CONTROL FUNCTION? 

ENTER 0 FOR NO OR 1 FOR YES 0 : 



1 



WITH THE PARTIAL PLOT THAT YOU HAVE JUST FINISHED 
CONSTRUCTING 

ENTER A 3 ELEMENT VECTOR FOR PARTIAL PLOT 

1ST ELEMENT, 1(0): LINES AND SYMBOLS ARE (NOT) SHOWN ON 

SCREEN 

2ND ELEMENT, 1(0): HEADER AND AXES ARE (NOT) SHOWN ON 
SCREEN 

3RD ELEMENT, 1(0): AXES , GRIDS, AND GRID LINES ARE (NOT) 

SHOWN 

E.G. 1 1 0 WILL SHOW EVERYTHING ON GRAPH EXCEPT AXES AND 
GRID LINES 0 : 



0 0 0 



ENTER A 4 ELEMENT VECTOR FOR AXES AND GRID CONTROL 
1ST ELEMENT, X-AXIS : 0 = BOTTOM, 2 = TOP, OR 20 h AT Y= 0 
2ND ELEMENT, Y-AXIS: 1 = LEFT, 3 s RIGHT, OR 21 = AT X = 0 
3RD ELEMENT, VERTICAL GRID LINES: 0 =NO GRID, 1 =DOTTED , OR 

2 =SOLID 

4 TH ELEMENT, HORIZON. GRID LINES: 0 =NO GRID, 1 =DOTTED , OR 

2= SO LID 

E.G. 2 1 2 2 WILL DISPLAY AXIS AT TOP AND LEFT AND SOLID 
GRID LINES 0 : 



20 3 1 1 



PLEASE WAIT RUNNING PAGE 



ENTER X-AXIS TIC MARKS LOCATION VECTOR 

OR ENTER 0 FOR STANDARD TIC MARKS 

OR ENTER 1 FOR NO TIC MARKS 

E.G. 1 5 11 OR A VECTOR NAME OR 0 OR 1 

1 5 11 WILL SHOW TIC MARKS AT X=1 , X=5 , AND X=ll 0 : 



80 170 260 350 440 530 620 710 



ENTER X-AXIS SYMBOLS (IN QUOTES) 

OR ENTER 0 WITHOUT QUOTES FOR STANDARD SYMBOLS 

OR ENTER 1 WITHOUT QUOTES FOR NO SYMBOLS 

E.G. '1970:1971' OR A VECTOR NAME OR 0 OR 1 0 : 



1 



ENTER X- AXIS SYMBOLS LOCATIONS VECTOR 

OR ENTER 0 FOR SYMBOLS AT DEFAULT LOCATIONS OR NO SYMBOLS 
E.G. 6 18 OR A VECTOR NAME OR 0 

6 18 WILL SHOW 1970 AT X=6 AND 1971 AT X=18 0 : 



0 
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ENTER Y-AXIS TIC MARKS LOCATION VECTOR 
OR ENTER 0 FOR STANDARD TIC MARKS 
OR ENTER 1 FOR NO TIC MARKS 

E.G. 10 1 OR A VECTOR NAME OR 0 OR 1 “ 1 0 1 WILL SHOW 
TIC MARKS AT Y = 1, Y=0 , AND Y = 1 0 : 



1 



ENTER Y-AXIS SYMBOLS (IN QUOTES ) 

OR ENTER 0 WITHOUT QUOTES FOR STANDARD SYMBOLS 
OR ENTER 1 WITHOUT QUOTES FOR NO SYMBOLS 
E.G. 'LO MID HI ' OR VECTOR NAME OR 0 OR 1 0 : 



1 



ENTER Y-AXIS SYMBOLS LOCATIONS VECTOR 

OR ENTER 0 FOR SYMBOLS AT DEFAULT LOCATIONS OR NO SYMBOLS 
I.E.~ 1 0 1 OR VECTOR NAME OR 0 “ 1 0 1 WILL SHOW LO AT Ys 
1, MID AT Y= 0, HI AT Y = 1 0 : 



0 



THESE AXIS CONTROL ENTRIES WILL NOW BE RUN 
DO YOU WANT TO RERUN THE PLOT INPUTS YOU ENTERED 
BEFORE RUNNING THIS AXIS CONTROL FUNCTION? 

ENTER 0 FOR NO OR 1 FOR YES 0 : 



0 



DO YOU WANT TO DO ANOTHER AXIS CONTROL PAGE? 
ENTER 0 FOR NO OR 1 FOR YES 0 : 



0 



DO YOU WANT TO RESTART THE FUNCTION? 

IF YOU DO’ NOT YOU WILL EXIT THIS FUNCTION 
IF YOU EXIT THIS FUNCTION AND WANT TO RETAIN THIS WORK 
USE THE KEEP FUNCTION AND THEN YOU CAN RETURN TO CMS 
BY ENTERING )OFF HOLD 

IF YOU WANT TO RETURN TO CMS, SIMPLY ENTER )OFF HOLD AFTER 
EXIT 

ENTER 0 FOR EXIT OR 1 FOR RESTART 0 : 



0 
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